ECMA-407
ECMA-407 is the world's first approved international 3D audio standard for the unrestricted delivery of channel-based, object-based and scene-based signals up to NHK 22.2. It was developed by Ecma TC32-TG22 in close cooperation with France Télévisions, Radio France, École Polytechnique Fédérale de Lausanne and McGill University in Montreal.
International standard ECMA-407 as approved in June 2014.
Type | International standard |
---|---|
Legal status | Approved (June 25, 2014) |
Purpose | Multichannel compression |
Headquarters | Geneva |
Region served | Worldwide |
Convenor TG22 | Mr Clemens Par |
Parent organization | Ecma International |
Website | www |
ECMA-407 uses inverse coding in the time domain, an invention by the Swiss-Austrian mathematician Clemens Par, and is reported to achieve the lowest spatial bitrates to date (for instance, several minutes of NHK 22.2 may be represented by an encapsulated data package of 100 bytes). The invention was selected for the World Intellectual Property Organization Award in 2009.[1] Inverse coding goes back to Victor Ambartsumian's scientific legacy on inverse problems and represents the first solution of its kind in audio: it separates sound sources at the same frequency by means of a time-level model.[2]
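To put the 100-byte figure in perspective, the spatial data rate it implies can be worked out directly. The sketch below assumes a duration of three minutes, since the text only says "several minutes"; the function name `spatial_bitrate` is purely illustrative.

```python
def spatial_bitrate(package_bytes: int, duration_seconds: float) -> float:
    """Return the average bitrate, in bits per second, of a data package."""
    return package_bytes * 8 / duration_seconds


# Assumption: "several minutes" is taken as 3 minutes (180 s) for illustration.
rate = spatial_bitrate(100, 180.0)
print(f"{rate:.1f} bit/s")  # about 4.4 bit/s of spatial data
```

This covers only the spatial data package, not the audio downmix, which is carried separately by the base audio codec.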
The Ecma S5 standards family
S5 ("Scalable Sparse Spatial Sound System") is a scalable multichannel coding system, which may incorporate a wide range of base audio codecs, preferably with additional encapsulation capacity for external data, e.g. MPEG-4 or MPEG-D. Compatible bit stream syntax may thus be maintained, and ECMA-407 becomes “invisible” during transmission.
An S5 codec can be described by the functional block diagrams of the S5 encoder and of the S5 decoder. According to the standard, an S5 encoder shall at least consist of a base S5 encoder, and an S5 decoder shall at least consist of a base S5 decoder.
Base S5 System
The base S5 encoder compresses multichannel audio information by downmixing the f-channel signal to g channels and produces sparse spatial data according to an inverse coding model, which approximates the localization and ambiance of the original signal. The inverse coding model directly constructs an upmix of h channels from the audio downmix and its associated spatial data.
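The data flow described above (f input channels, a g-channel downmix, sparse spatial data, an h-channel upmix) can be illustrated with a minimal sketch. It is not the inverse coding model of ECMA-407: it assumes g = 1, an averaging downmix, and one hypothetical gain and delay per output channel as a stand-in for the time-level spatial data. All names (`SpatialParams`, `downmix`, `upmix`) are invented for illustration.

```python
from dataclasses import dataclass
from typing import List

Signal = List[float]  # one channel of audio samples


@dataclass
class SpatialParams:
    """Hypothetical time-level parameters for one output channel."""
    gain: float  # level applied to the downmix
    delay: int   # delay in samples applied to the downmix


def downmix(channels: List[Signal]) -> Signal:
    """Average f equal-length input channels into one downmix channel (g = 1)."""
    n = len(channels[0])
    return [sum(ch[i] for ch in channels) / len(channels) for i in range(n)]


def upmix(mono: Signal, params: List[SpatialParams]) -> List[Signal]:
    """Construct h output channels, one per parameter set, from the downmix."""
    out = []
    for p in params:
        d = min(p.delay, len(mono))  # clamp so the output keeps its length
        delayed = [0.0] * d + mono[: len(mono) - d]
        out.append([p.gain * s for s in delayed])
    return out


# f = 2 input channels, g = 1 downmix channel, h = 2 output channels
mono = downmix([[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]])
outputs = upmix(mono, [SpatialParams(1.0, 0), SpatialParams(0.5, 1)])
```

The point of the sketch is the asymmetry in size: the downmix carries the audio, while the spatial data is only a handful of numbers per output channel. In a real codec these parameters would come from signal analysis rather than being set by hand.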
Compressing the downmix audio by a base audio coder further increases the coding efficiency of S5. The various bitstreams produced by the functional units of an S5 encoder may be encapsulated into a single bitstream by the functional unit 'Multiplexer'.
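As a rough illustration of what a multiplexer unit does, the following sketch packs several byte streams into one and recovers them again. The framing (a one-byte stream identifier followed by a four-byte big-endian length) and the identifier values are assumptions made for this example; they are not the bitstream syntax of ECMA-407 or of any base audio codec.

```python
import struct
from typing import Dict

# Hypothetical stream identifiers, chosen only for this example.
DOWNMIX_AUDIO = 0
SPATIAL_DATA = 1
ANCILLARY_DATA = 2

_HEADER = struct.Struct(">BI")  # 1-byte id, 4-byte big-endian payload length


def multiplex(streams: Dict[int, bytes]) -> bytes:
    """Concatenate the streams, each prefixed with its id and length."""
    out = bytearray()
    for stream_id, payload in streams.items():
        out += _HEADER.pack(stream_id, len(payload))
        out += payload
    return bytes(out)


def demultiplex(data: bytes) -> Dict[int, bytes]:
    """Split a multiplexed byte string back into its streams."""
    streams: Dict[int, bytes] = {}
    pos = 0
    while pos < len(data):
        stream_id, length = _HEADER.unpack_from(data, pos)
        pos += _HEADER.size
        streams[stream_id] = data[pos:pos + length]
        pos += length
    return streams


packed = multiplex({DOWNMIX_AUDIO: b"\x01\x02\x03", SPATIAL_DATA: b"\x7f"})
unpacked = demultiplex(packed)
```

Length-prefixed framing is used here because it lets a decoder skip streams it does not understand, which is one simple way for additional data to coexist with an existing bitstream.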
Ancillary data may be conveyed from the S5 encoder to the S5 decoder. This ancillary data may be used to encapsulate data other than coding parameters, for example, loudness parameters according to ITU-R or the European Broadcasting Union, which may be used to adjust the perceived level of audio signals.
ECMA-407 specifies the base S5 encoder/decoder and their interfaces only. All other components and their interfaces are not specified, since specific S5 codecs are subject to separate standards. As the base S5 encoder and the base S5 decoder are agnostic to the other system components, ECMA-407 represents the common base for all S5-specific standards.
S5 Conformance
ECMA-407 specifies the base S5 encoder and decoder in terms of configuration data, downmix, inverse coding parameter data and upmix. In addition it provides reference and guidance on how to incorporate further components to form a scalable multichannel coding system for audio data compression.
S5 conformance is established by its specified dataflows.
S5 Signal Analysis and Invariant Theory
S5 'Signal Analysis' is not specified. It may be based either on statistical methods, which require extensive computational power, or on algebraic invariants of Gaussian processes, discovered by Clemens Par in 2010 after Rudolf E. Kálmán had alerted him to this classical problem. The latter approach builds on the German mathematician David Hilbert's proof of the invariant field, published in 1893, and on the apolarity behavior of algebraic cones, studied extensively by Grace and Young in 1903.[3][4][5][6]
Basing S5 signal analysis on algebraic invariants can greatly reduce computational complexity.
S5 Extensions
Inverse coding in the time domain has its counterpart in inverse coding in the frequency domain, another invention by Clemens Par from 2013 and 2014, which, unlike parametric coding, requires no side information at all. This upmix in the Fourier domain may double the number of output channels without altering the bitstream syntax.
S5 Loudness and Electronic Fingerprint
ECMA-407 uses modern tools such as loudness measurement according to ITU-R or the European Broadcasting Union and electronic fingerprinting, which allow internal and external synchronization, for example with video, with additional services or with "second screens".
First ECMA-407 Implementation and Broadcasts
ECMA-407 was first implemented by Swissaudec in 2014 in co-operation with Marco Mattavelli's research group at École Polytechnique Fédérale de Lausanne and was tuned by Swissaudec and Wieslaw Woszczyk’s research group at McGill University’s “Studio 22”.
Ecma S5 was first presented to the public by the Convenor of Ecma TC32-TG22 at the International Multimedia Telecommunications Consortium in Porto, Portugal, on October 8, 2013.
Its world premiere was an ECMA-407 satellite test carrier, which was established in cooperation with France Télévisions, SES, Ecma International and other partners for the "Future Zone" of IBC in Amsterdam in September 2014. ECMA-407 attracted more than 1,500 broadcasting experts and received international news coverage (e.g. FKT, Studio Magazin).
References
- Par, C. Two Undiscovered Treasures for Ground-breaking 3D Audio Coding Technology at Lowest Bitrates: Inverse Problems and Invariant Theory. Proceedings of 27th VDT International Convention, 11/2012. ISBN 978-3-9812830-3-7.
- Ambartsumian, R. V. (1998). A Life in Astrophysics: Selected Papers of Viktor Ambartsumian. Allerton Press.
- Hilbert, D. (1893). Über die vollen Invariantensysteme. Mathematische Annalen, Bd. 42.
- Grace, J. H.; Young, A. (1903). The Algebra of Invariants. Cambridge University Press.
- Ahmad, J. J.; Alberti, C.; Hong, J.; Leonard, B.; Mattavelli, M.; Par, C.; Quackenbush, S.; Woszczyk, W. "ECMA-407: New Approaches to 3D Audio Content Data Rate Reduction with RVC-CAL". AES Paper 9218.
- Ahmad, J. J.; Hong, J.; Leonard, B.; Mattavelli, M.; Par, C.; Quackenbush, S.; Woszczyk, W. "ECMA-407: A New 3D Audio Codec Implementation up to NHK 22.2 with RVC-CAL". Proceedings of 28th VDT International Convention, 11/2014.
External links
- Ecma TC32-TG22 (in English)
- 20th Anniversary Forum (in English)
- France Télévisions (in French)
Further reading
- Clemens Par. Stereo- und Surroundsound mit nur einem Kanal. Elektronik Informationen, 9/2010.
- Clemens Par. Dreidimensionaler Klang aus einem Kanal - die sanfte Revolution der Audiocodierung. FKT, 12/2010.
- Clemens Par. Revolutionary 3D Filter for Audio, Satellites, Cars, Medical Devices & Industrial Measurement. Electronics World, 1/2011.
Reviews
- "Erste professionelle 1-Kanallösung für Stereo, Surround, Dolby- und DTS-Formate und SDDS", Elektronik Industrie, 2/2009 and 6/2009.
- "Clemens Par". CE Markt, 4/2009.
- "Gorodissky-Preis für einkanaliges Stereo-System", Aktuelle Technik, 8/2009.
- K. Koch. Der Erfinder. Finanz und Wirtschaft, 9/2009.
- "Erfinderpreis für einkanalige Stereolösung", Mechatronik, 9/2009.
- "Clemens Par erhält internationale Auszeichnung", Swiss IT Reseller, 9/2009.
- "Ein-Kanal-Stereo", Elektrohändler, 10/2009.
- S. Buss. Générer des signaux stéréo à partir d'un signal mono. La Revue Polytechnique, 4/2010.
- "Neue Perspektiven". Studio Magazin, 9/2010.
- "Immersiver 3D-Sound auf mobilen Endgeräten". FKT, 10/2012.
- K. Koch. Raumklang. Finanz und Wirtschaft, 10/2012.
- "Weiterentwickelte 3D-Audiotechnologien". FKT, 2/2013.