
New Direction of Digital Audio Products--Spatial Audio

Author: Tanssion Date: 2023-08-23

Ⅰ. What is spatial audio?
Ⅱ. The development of spatial audio
Ⅲ. How does spatial audio work?
Ⅳ. Spatial audio recording method
Ⅴ. Spatial audio collection
Ⅵ. Spatial audio music platform
Ⅶ. Static spatial audio VS dynamic spatial audio
Ⅷ. Spatial audio processing method

Ⅰ. What is spatial audio?

Often referred to as 3D audio, spatial audio immerses the listener in a virtual three-dimensional sound space. Most smartphones can already play spatial audio. To make full use of the technology in music, movies, games and other content, however, more spatial audio content needs to be produced and newer technologies adopted. In headphones and true wireless earbuds, for example, spatial audio with head tracking can provide a fully immersive surround sound experience.

Ⅱ. The development of spatial audio

Audio has developed from mono to stereo, then to multi-channel surround sound, and finally to spatial audio. Mono uses a single microphone to pick up the sound and a single speaker to play it back: sounds arriving from different directions are recorded into one signal and reproduced from one point. The listener can therefore perceive only the timbre, pitch, loudness and front-to-back distance of the sound, not any lateral movement. Stereo consists of two channels that differ in level and phase. Compared with mono, it conveys the direction and layering of sound and gives a sense of space. Common stereo coding techniques include parametric stereo (PS), intensity stereo (IS), left/right (L/R), mid/side (M/S) and joint stereo (JS). Multichannel surround sound carries more than two channels and gives a better spatial effect than stereo.
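Of the stereo coding techniques listed above, mid/side (M/S) is the easiest to show in a few lines. The sketch below is only an illustration: the function names are hypothetical, samples are plain Python floats, and the scaling by one half is one common convention among several.

```python
def ms_encode(left, right):
    """Convert left/right sample lists to mid/side sample lists."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]   # what both channels share
    side = [(l - r) / 2 for l, r in zip(left, right)]  # what differs between them
    return mid, side


def ms_decode(mid, side):
    """Recover left/right sample lists from mid/side sample lists."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


left = [0.5, 0.25, -1.0]
right = [0.5, -0.25, 0.0]
mid, side = ms_encode(left, right)
print(mid, side)
print(ms_decode(mid, side))
```

When the two channels are similar, the side signal is small, which is why this representation is attractive for coding.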

Ⅲ. How does spatial audio work?

Before spatial audio, surround sound was encoded so that each sound was assigned to a specific speaker. Dialogue usually comes from the center speaker, while background music and effects usually come from the rear speakers. Instead of assigning sounds to specific speakers, spatial audio places them in a space. For example, an effect can be positioned above and to the right of center. Based on the number of speakers and their layout, your system works out how best to make the sound appear to come from that location.
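How a playback system might turn a position into speaker feeds can be sketched with simple pairwise constant-power panning on a horizontal ring of speakers. This is an assumption chosen for illustration, not the method of any particular renderer; it ignores height, and the function name and angle convention (degrees around the listener) are hypothetical.

```python
import math


def ring_pan_gains(source_az_deg, speaker_az_deg):
    """Return one gain per speaker for a source at the given azimuth.

    The source is shared between the two speakers on either side of it,
    using sine/cosine gains so the total power stays constant.
    """
    n = len(speaker_az_deg)
    if n < 2:
        raise ValueError("need at least two speakers")
    # Visit speakers in order of increasing azimuth around the circle.
    order = sorted(range(n), key=lambda i: speaker_az_deg[i] % 360)
    src = source_az_deg % 360
    gains = [0.0] * n
    for k in range(n):
        a, b = order[k], order[(k + 1) % n]
        start = speaker_az_deg[a] % 360
        span = (speaker_az_deg[b] - speaker_az_deg[a]) % 360
        offset = (src - start) % 360
        if span > 0 and offset <= span:
            frac = offset / span  # 0 at speaker a, 1 at speaker b
            gains[a] = math.cos(frac * math.pi / 2)
            gains[b] = math.sin(frac * math.pi / 2)
            break
    return gains


# The same source position rendered on two different layouts.
print(ring_pan_gains(90, [45, 135, 225, 315]))  # four speakers
print(ring_pan_gains(0, [330, 30]))             # two speakers
```

The point of the example is that the position is stored once and the gains are computed per layout at playback time.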

Spatial audio also adds height, making a dome of sound possible. You can hear a helicopter lift off and fly over your head, or bullets whizzing past your ears.

Applying spatial audio to music is similar, but the impact on your experience is different. When a song is mixed in spatial audio, the music can actually surround you, and you feel as if you are in the middle of the band. The vocals might be in front of you, the guitars to your right and the harmonies behind you. This can be a fun experience or a completely disorienting one. If you are used to the stereo mix of a favorite song, a spatial audio mix can bring out something you have never heard before, or it can make you wish you were hearing the version you have always known and loved.

Ⅳ. Spatial audio recording method

1. Single point sound source recording

We can record with a single mono microphone, pan the recording to the desired position in the sound field in the traditional way, and then use an algorithm to render it for a full 3D speaker setup.

2. Microphone array

We place several mono microphones in a multi-channel array to record the sound sources, then pan the signals in post-production software to position them in a three-dimensional scene.

3. Dummy head

The dummy head binaural recording technique places two omnidirectional microphones in the ears of a dummy head to simulate human perception of sound. The recording captures important auditory cues about source distance, sense of space, timbre and direction, reflecting the sound that reaches a listener's ears in a real environment.

4. Ambisonic microphones

Ambisonics is a multi-channel technology that captures sound spherically, from all directions, at a single point in a scene. This is done with a dedicated Ambisonics microphone, a specially designed unit that contains four cardioid capsules pointing in different directions. This arrangement is called a tetrahedral array. The microphone signal is in Ambisonics A-format and needs to be converted to B-format for post-processing.

A-format is the raw audio from an Ambisonics microphone, with each capsule providing one audio channel. B-format is the standard multi-channel format for Ambisonics audio. Because A-format depends on the microphone model, native A-format recordings must be converted to standard B-format for post-production and compatibility. Major post-production and playback tools now support Ambisonics, which makes it well suited to virtual reality and other productions involving 3D spatial sound.
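The A-format to B-format conversion can be sketched as sums and differences of the four capsule signals. The capsule layout assumed here (front-left-up, front-right-down, back-left-down, back-right-up) is a common tetrahedral arrangement but is an assumption, as is the function name; real converters also apply scaling and microphone-specific correction filters that are omitted.

```python
def a_to_b_format(flu, frd, bld, bru):
    """Convert four capsule sample lists (A-format) to W, X, Y, Z (B-format).

    flu: front-left-up, frd: front-right-down,
    bld: back-left-down, bru: back-right-up.
    """
    w, x, y, z = [], [], [], []
    for a, b, c, d in zip(flu, frd, bld, bru):
        w.append(a + b + c + d)  # omnidirectional: everything summed
        x.append(a + b - c - d)  # front minus back
        y.append(a - b + c - d)  # left minus right
        z.append(a - b - c + d)  # up minus down
    return w, x, y, z


# A sound picked up only by the two front capsules ends up in W and X.
print(a_to_b_format([1.0], [1.0], [0.0], [0.0]))
```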

Ⅴ. Spatial audio collection

1. In-ear microphone and artificial head

To preserve exactly what the human ear hears, we can use in-ear microphones to record directly the audio arriving at the left and right ear canals. Alternatively, we can use an artificial head: a bionic model of the human head, auricles, ear canals and other parts, with microphones built into the artificial ears to collect spatial audio.

The difference between the two methods is easy to see. Audio recorded with in-ear microphones in your own ears and played back through in-ear headphones is reproduced almost perfectly. Audio recorded with an artificial head is shaped by a pinna and head that differ from your own, so although much of the sense of space is restored, it still differs somewhat from being at the scene yourself. Although everyone's ears and head differ in detail, their general shape and position are similar, so artificial heads are widely used in film, television and game audio production.

2. Quad binaural

Quad Binaural can be understood simply as a four-way dummy head microphone. It captures sound fields with HRTF (head-related transfer function) information in four horizontal directions: 0°, 90°, 180° and 270°. For an angle other than these four, such as 120°, the two neighboring data sets at 90° and 180° are combined algorithmically. The main microphone using Quad Binaural technology is 3Dio's Omni.
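One simple way to combine the two neighboring recordings is a linear crossfade, sketched below. The linear weighting is an assumption made for illustration (the article does not say which algorithm is used), and both function names are hypothetical.

```python
def quad_binaural_weights(yaw_deg):
    """Return crossfade weights for the recordings at 0, 90, 180 and 270 degrees."""
    yaw = yaw_deg % 360
    sector = int(yaw // 90)      # which 90-degree sector the angle falls in
    frac = yaw / 90 - sector     # 0 at the lower recording, 1 at the upper one
    lower = sector % 4
    upper = (sector + 1) % 4
    weights = [0.0, 0.0, 0.0, 0.0]
    weights[lower] += 1.0 - frac
    weights[upper] += frac
    return weights


def blend_frame(frames, weights):
    """Mix four (left, right) sample pairs, one per recorded direction."""
    left = sum(w * f[0] for w, f in zip(weights, frames))
    right = sum(w * f[1] for w, f in zip(weights, frames))
    return left, right


# At 120 degrees only the 90 and 180 degree recordings contribute.
print(quad_binaural_weights(120))
```

For 120° the 90° recording gets two thirds of the weight and the 180° recording one third, because 120° lies one third of the way between them.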

The advantage of Quad Binaural is that it captures natural HRTF information, so the later processing and decoding are very simple. Horizontal localization is also better than with typical low-order Ambisonics. The disadvantage is equally clear: because the HRTFs for the four directions all lie in the same horizontal plane, height information cannot respond to head rotation. In other words, when you turn your head left or right the sound changes with your direction, but when you look up or down it does not.

3. Ambisonics

The audio collected by an artificial head or in-ear microphones is only a fixed-direction binaural recording: it reproduces the sound as heard with the head facing the direction it faced during recording. To record the sound field of the entire space, so that you can turn your head and listen in any direction during playback, another technology is needed: Ambisonics.

Ambisonics originated from research on three-dimensional sound field reconstruction at Oxford University in the 1970s. The core of the technology is to capture the sound field at the recording location with a special microphone so that it can be reproduced elsewhere, for example with a first-order Ambisonics microphone (a tetrahedral array of four identical microphone capsules).

The raw data collected by a first-order Ambisonics microphone is called A-format. Four cardioid capsules point in four directions: left front (LF), left back (LB), right front (RF) and right back (RB). A-format cannot be played directly; it must first be converted to 4-channel B-format, also called first-order B-format. The four channels are called W, X, Y and Z. Put simply, W is the omnidirectional (center) component of a spherical sound field, X the front-back component, Y the left-right component and Z the up-down component. Software can render B-format data to any format a playback device supports, such as stereo, 2.1, 5.1 or even 7.1.

Low-order Ambisonics microphones can reproduce only a relatively small sound field. For scenes such as an airport or a large concert, a higher-order Ambisonics microphone may be needed, and the higher the order, the more microphone capsules are required. For example, the Audio Camera from VisiSonics uses 7th-order Ambisonics with 64 channels. Because Ambisonics restores the whole sound field well in AR, VR and other scenarios where the viewpoint rotates, it is widely used.
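To make the W, X, Y, Z channels and the rendering step more concrete, the sketch below encodes a mono sample into first-order B-format and decodes it to stereo with two virtual cardioid microphones. It assumes the traditional convention in which W is scaled by 1/√2 and azimuth is measured counterclockwise from the front (so positive Y is to the left); the function names and the default virtual microphone angles are hypothetical choices for illustration.

```python
import math


def encode_b_format(sample, azimuth_deg, elevation_deg=0.0):
    """Encode one mono sample arriving from the given direction."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2)                  # omnidirectional
    x = sample * math.cos(az) * math.cos(el)   # front-back
    y = sample * math.sin(az) * math.cos(el)   # left-right
    z = sample * math.sin(el)                  # up-down
    return w, x, y, z


def virtual_cardioid(w, x, y, z, azimuth_deg):
    """Signal of a horizontal virtual cardioid microphone aimed at azimuth_deg."""
    az = math.radians(azimuth_deg)
    return 0.5 * (math.sqrt(2) * w + x * math.cos(az) + y * math.sin(az))


def decode_to_stereo(w, x, y, z, spread_deg=90.0):
    """Render B-format to left/right with two virtual cardioids."""
    left = virtual_cardioid(w, x, y, z, +spread_deg)
    right = virtual_cardioid(w, x, y, z, -spread_deg)
    return left, right


# A source directly to the left ends up almost entirely in the left channel.
print(decode_to_stereo(*encode_b_format(1.0, 90.0)))
```

Other playback formats follow the same idea with more virtual microphones, one aimed at each loudspeaker position.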
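The 64 channels quoted for a 7th-order system match the usual rule that full-sphere Ambisonics of order N carries (N + 1)² channels. A minimal check, with a hypothetical function name:

```python
def ambisonic_channel_count(order):
    """Number of channels in full-sphere Ambisonics of the given order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


print(ambisonic_channel_count(1))  # 4  (W, X, Y, Z)
print(ambisonic_channel_count(7))  # 64
```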

Ⅵ. Spatial audio music platform

1. Apple Music

Apple Music is a streaming service that lets you listen to tens of millions of songs. Its features include downloading songs for offline playback, real-time lyrics, listening across devices, recommendations based on your preferences and playlists curated by editors. It also offers exclusive content and original programming.

Dolby Atmos brings you spatial audio that surrounds you, and lossless audio lets you hear fine details clearly. Dolby Atmos is an audio technology for immersive listening. Stereo mixes can only be presented through the left and right channels, but music mixed in Dolby Atmos is not limited to fixed channels, so sound can be placed all around the listener. Musicians can also adjust the volume, balance and intensity of each instrument to bring out the subtleties of a work.

Apple Music subscribers can listen to thousands of Dolby Atmos-enabled songs on any headphones, as long as they are running the latest version of Apple Music on their iPhone, iPad or Mac. Music that supports Dolby Atmos plays in this mode automatically when you listen with compatible Apple or Beats headphones.

2. NetEase Cloud Music

NetEase Immersive Sound uses a self-developed algorithm to separate a two-channel source into its different sound elements, then applies a spatial position transfer function to create an immersive spatial experience. It applies to all content on NetEase Cloud Music.

The NetEase Cloud Music mobile app has also launched a Dolby Atmos music service. Its built-in Dolby Atmos zone offers a rich catalog of music content and a new listening experience.

3. Huawei Music

Huawei Music has launched a spatial audio experience zone. In it you can listen to spatial audio versions of popular songs by Cai Jianya, Chen Linong, Chen Zitong TIFA, Gina Alice, Sunnnee and Xu Wei, feel the music surround you and experience an immersive sound space. Audio Vivid redefines "good sound".

4. QQ Music

QQ Music has launched a Dolby Atmos music function, becoming the first music platform in China to support Dolby Atmos. Super members can now enjoy immersive, high-quality Dolby Atmos music on Dolby Atmos-enabled Android phones.

On July 6, 2022, Tencent Music Entertainment Group (TME) and Dolby Laboratories, a leader in immersive entertainment, jointly announced the launch of the Dolby Atmos music function on QQ Music. The launch marks the beginning of a strategic cooperation between the two companies, which will work together to further popularize Dolby Atmos music in China.

Dolby Atmos Music is a new way to create and experience music that maximizes artistic expression and creates a deeper connection between musicians and their fans. Music in Dolby Atmos goes beyond the ordinary listening experience, immersing you in the song and delivering rich detail with great clarity and depth. It gives musicians more creative space and freedom to realize their vision, and it opens up a new way for fans to feel the emotion in music. Whether you are listening to layers of instrumentation swirling around you, catching a singer's tiny breaths between lyrics or feeling a melody wash over you, nothing brings you into the music like Dolby Atmos.

Ⅶ. Static spatial audio VS dynamic spatial audio

In traditional static spatial audio, the sound stays fixed relative to the head: when the head turns, audio on the right remains on the right and audio on the left remains on the left. This is because the spatialization is done without head tracking, so the spatial audio is locked to the listener's head. The listener still gets a sense of sounds being nearer or farther away.

With head tracking, dynamic spatial audio can provide a more immersive experience. For example, if you turn your head to the right, the entire sound field rotates to the left by an equal amount, and the signal sent to each ear depends on the position of the head. As you move, the soundstage around you is updated quickly, so you stay immersed in a full 360-degree soundstage at all times. To benefit from spatial audio, songs, games, movies and other media must still be available in 5.1, 7.1 or Dolby Atmos formats. Only then can users experience static or dynamic spatial audio content.
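The counter-rotation described above can be sketched for a first-order B-format signal (W, X, Y, Z), which is one convenient representation to rotate; actual products may work on other formats. The sketch assumes X points to the front and Y to the left, handles yaw only, and uses a hypothetical function name.

```python
import math


def compensate_head_yaw(w, x, y, z, head_yaw_right_deg):
    """Rotate the sound field left by the angle the head has turned right."""
    beta = math.radians(head_yaw_right_deg)
    x_rot = x * math.cos(beta) - y * math.sin(beta)
    y_rot = x * math.sin(beta) + y * math.cos(beta)
    # W (omnidirectional) and Z (height) are unaffected by a pure yaw turn.
    return w, x_rot, y_rot, z


# A source straight ahead (X = 1) moves to the listener's left (Y = 1)
# after the head turns 90 degrees to the right.
print(compensate_head_yaw(0.7, 1.0, 0.0, 0.0, 90.0))
```

Running this for every new head-tracker reading keeps sources anchored in the room rather than to the head.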

Ⅷ. Spatial audio processing method

In terms of production technology, spatial audio can be divided into three schemes: object-based, scene-based and channel-based. Each is briefly introduced below.

1. Object-based

This approach overcomes the limitations of the channel-based scheme (item 3 below). The direction and loudness of each sound object are encoded independently, and the playback system positions the audio where it needs to be. This gives the flexibility to adapt to factors such as the user's environment and platform. The format can reproduce audio on anything from mono to a full 360-degree sphere.

2. Scene-based

This scheme captures complete scene information from the very center of the scene, most commonly using Ambisonics. The result is a full 360-degree sphere that can be captured from a single point with an Ambisonics microphone or created artificially in post-production. Ambisonics comes in two flavors: FOA (first-order Ambisonics) and HOA (higher-order Ambisonics). FOA contains four channels: omnidirectional, left-right, front-back and up-down. HOA uses more channels; more channels mean higher spatial resolution, and higher resolution means better localization.

3. Channel-based

This is the most traditional and mature approach, in which the production framework is tied to the reproduction format. The various sound sources are mixed in a digital audio workstation (DAW) to create the final channel-based mix. The mix usually targets a specific loudspeaker layout: each channel of the final product must be reproduced by a loudspeaker at a well-defined position, and it is delivered to the end user as a fixed mix. The familiar mono, stereo, 5.1 and 7.1 formats are all of this type.


Frequently Asked Questions

1. What is spatial audio?
Spatial audio with dynamic head tracking brings cinema-like sound to the film or video you're watching, so that sound feels like it's coming from all around you.
2. Is Spatial Audio only for AirPods?
No, any headphones will do. By default, Apple Music automatically plays Dolby Atmos tracks on all AirPods and Beats headphones that feature the H1 or W1 chip. To get Spatial Audio with Dolby Atmos working on any other pair of headphones, you just need to set Dolby Atmos to 'Always On' in the settings.
3. Why use spatial audio for music?
Spatial Audio lets you hear three-dimensional audio from supported videos that follows the movement of your iPhone or iPad. It effectively recreates a cinema-style experience, where sounds appear to come from all around you: in front, behind, from the side and even above your head.
4. Are Spatial Audio and 3D audio the same?
In the industry, spatial audio refers to a very specific type of experience. You may also hear it referred to as 3D audio or, in Samsung's case, 360 audio. The technical term for these types of experiences is head-tracked binaural audio.
5. Is Spatial Audio better than Dolby?
In a nutshell, Dolby Atmos creates the effect of watching a movie at a cinema or listening to music at a live concert, with sound coming from all around you (center, left, right, above and behind). Spatial Audio adds another layer that makes you feel like you're in the movie or moving around at the concert.
