Sound in digital cinema

Sound in digital cinema can be divided into two parts:

Main Sound: Sound Essence

Main Sound has been part of the DCI specifications and SMPTE standards from the beginning.

Under the CPL tag named MainSound, you have an MXF that includes up to 16 channels of data (see note 1), with the possibility of dispatching them.

You always encounter this type of Sound MXF alongside a Picture MXF.

Summary of technical specifications

Encoding              Uncompressed PCM (WAVE/RIFF, see note 2), interleaved channels
Sample bit depth      24 bits per sample
Sample rate           48 kHz or 96 kHz
Maximum channels      16 channels
Minimum bandwidth     19 Mb/s at 48 kHz, 37 Mb/s at 96 kHz (see note 3)
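
The bandwidth figures above are the product of channel count, bit depth and sample rate (see note 3). As a minimal sketch, the calculation could be written in Python as follows; the function names are illustrative assumptions and not part of any DCI or SMPTE tooling.

```python
def pcm_bandwidth_bps(channels: int, bit_depth: int, sample_rate_hz: int) -> int:
    """Raw bit rate of uncompressed PCM audio, in bits per second."""
    return channels * bit_depth * sample_rate_hz


def to_mbps(bits_per_second: int) -> float:
    """Convert bits per second to megabits per second (1 Mb = 1,000,000 bits)."""
    return bits_per_second / 1_000_000


if __name__ == "__main__":
    # Main Sound: 16 channels, 24 bits per sample, at both allowed sample rates.
    for rate_hz in (48_000, 96_000):
        bps = pcm_bandwidth_bps(channels=16, bit_depth=24, sample_rate_hz=rate_hz)
        print(f"{rate_hz // 1000} kHz, 16 channels: {to_mbps(bps):.3f} Mb/s")
```

The exact results are 18.432 Mb/s at 48 kHz and 36.864 Mb/s at 96 kHz, which the summary rounds up to 19 Mb/s and 37 Mb/s.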

You will find more information about the CPL MainSound and the MXF MainSound in their respective chapters.

Immersive Sound: Object-Based Audio Essence (OBAE)

Immersive Sound was introduced in the early 2010s (see note 4) under the name Object-Based Audio Essence (OBAE), and is based on the SMPTE Immersive Audio Bitstream (IAB) standards.

Under the CPL tag named AuxData, you have an MXF that includes additional sound elements based on the Immersive Audio Bitstream (IAB) standard. Dolby Atmos, DTS-X and Barco Auro are the immersive formats associated with the IAB standards.

You will sometimes encounter this type of IAB MXF in addition to MXF Picture and MXF Sound.
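
To illustrate how the two CPL tags sit side by side, here is a rough Python sketch that lists the sound-related tags found in a CPL. The sample XML is a simplified, hypothetical fragment (a real CPL uses XML namespaces and carries many more elements per asset), and the helper names `local_name` and `sound_assets` are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical CPL fragment for illustration only:
# a real CPL declares namespaces and describes each asset in detail.
SAMPLE_CPL = """
<CompositionPlaylist>
  <ReelList>
    <Reel>
      <AssetList>
        <MainPicture/>
        <MainSound/>
        <AuxData/>
      </AssetList>
    </Reel>
  </ReelList>
</CompositionPlaylist>
"""


def local_name(tag: str) -> str:
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def sound_assets(cpl_xml: str) -> list[str]:
    """Return the sound-related tag names found in a CPL, in document order."""
    wanted = {"MainSound", "AuxData"}
    root = ET.fromstring(cpl_xml.strip())
    return [local_name(el.tag) for el in root.iter() if local_name(el.tag) in wanted]


if __name__ == "__main__":
    print(sound_assets(SAMPLE_CPL))
```

A CPL without immersive audio would simply yield a list containing only MainSound.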

The OBAE (or IAB) is a special type of sound asset because it is based on sound objects, each carrying metadata that includes spatialisation and temporal information.

OBAE? IAB? Enhanced Audio (EA)?

It can be difficult to distinguish between Object-Based Audio Essence (OBAE), Immersive Audio Bitstream (IAB) and Enhanced Audio (EA).

Treat all of these as synonyms; which one you hear depends on who is speaking. Preferably, use the terms IAB or OBAE.

Note that we mainly refer to the Immersive Audio Bitstream (IAB) standards, so don't be surprised to find few references to the acronym OBAE, and even fewer to Enhanced Audio (EA).

Summary of technical specifications

-                     Technical capacity          D-Cinema constraints
Encoding              PCM WAVE or DLC encoding    DLC only
Sample bit depth      24 bits per sample
Sample rate           48 kHz or 96 kHz            48 kHz only
Maximum channels      128 channels
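
As a hedged illustration of the D-Cinema constraints in the summary above, the following Python sketch checks a set of immersive audio parameters against them. The class and function names are hypothetical, and treating the 24-bit depth and the 128-channel limit as applying to D-Cinema as well is an assumption drawn from the summary, not from the standards themselves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImmersiveAudioParams:
    """Hypothetical container for the parameters listed in the summary."""
    encoding: str          # "PCM" or "DLC"
    bit_depth: int         # bits per sample
    sample_rate_hz: int    # samples per second
    channels: int


def dcinema_violations(params: ImmersiveAudioParams) -> list[str]:
    """Return a list of D-Cinema constraint violations (empty if none)."""
    problems = []
    if params.encoding != "DLC":
        problems.append("encoding must be DLC")
    if params.sample_rate_hz != 48_000:
        problems.append("sample rate must be 48 kHz")
    # Assumption: bit depth and channel limit apply to D-Cinema unchanged.
    if params.bit_depth != 24:
        problems.append("bit depth must be 24 bits per sample")
    if not 1 <= params.channels <= 128:
        problems.append("channel count must be between 1 and 128")
    return problems


if __name__ == "__main__":
    candidate = ImmersiveAudioParams("PCM", 24, 96_000, 48)
    for problem in dcinema_violations(candidate):
        print(problem)
```

Returning a list of messages rather than a single boolean makes it easier to report every violated constraint at once.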

You will find all the information about Immersive Audio in the respective chapters, including the Dolby Atmos chapter and its MXF Immersive Audio, as well as the chapters on Barco Auro and DTS-X, which are also "Immersive Audio" formats.

Chapters related to sound

Notes


  1. Of course, I say "data" because, while the main channels carry the audio intended for the viewer, there are also technical channels that carry synchronization signals, or other data, used by external devices such as Dolby Atmos or even DBOX.

  2. The DCI specifications and SMPTE standards refer a lot to BWF (Broadcast Wave Format), which is a kind of evolution of the WAVE format with more metadata. However, this metadata isn't used for the audio asset. See the MXF Sound Essence chapter, and especially its footnotes, for more information about the WAVE/BWF format.

  3. A few examples of audio bandwidth calculations:

    • (Sound Essence)  48 kHz and 16 channels : 16 channels * 24 bits * 48 kHz = 18.4 Mb/s (rounded up to 19 Mb/s)
    • (Sound Essence)  96 kHz and 16 channels : 16 channels * 24 bits * 96 kHz = 36.9 Mb/s (rounded up to 37 Mb/s)
    • (Immersive Audio) 48 kHz and 48 channels : 48 channels * 24 bits * 48 kHz = 55.3 Mb/s
    • (Immersive Audio) 96 kHz and 48 channels : 48 channels * 24 bits * 96 kHz = 110.6 Mb/s

  4. Integrated into the DCI specifications in 2013.