Why we don’t use Ambisonic Microphones - even for VR

The hype around Ambisonic microphones has been unbroken ever since VR became mainstream. I think there are a bunch of misunderstandings around them, and the possibility to record a rather flexible format for a relatively low price is surely tempting. So, after a lot of experimenting and a lot of gear shootouts, here are my two cents on surround / ambient recordings for VR, the solution I settled on, and why Ambisonic microphones and other coincident surround setups aren’t part of it. Mono sources are a very important but different topic for VR.

Let’s start with a few basic things which still are misunderstood way too often.

Binaural

Binaural recordings are great for several reasons, but they are not the best solution for VR applications in which users can turn their heads and still need to be able to locate sounds, even (or especially) without seeing the sources. They provide an excellent virtual surround picture when listened to through headphones, but only from one perspective / direction. When a binaural recording is rotated afterwards, the whole idea of binaural recording and the resulting image are destroyed. For VR, binaural is only the delivery format: it is the result that gets calculated from one or several audio sources to create a believable and engaging acoustical 3D environment on only two channels, the left and right headphone speakers. In short, as most have come to realize: No, binaural recordings are not the best recording format for VR, 360° videos or any other interactive 3D media.

Ambisonic

Pretty much the same misunderstanding is widespread when it comes to Ambisonic. Ambisonic is a system developed in the 60s / 70s. Today it is mostly used as a delivery format: it stores the direction of each sound so that it can be decoded to a wide range of speaker setups, one of them being binaural for headphones. That means there is no need for native Ambisonic recordings in order to deliver audio in Ambisonic format for VR or 360° videos and create a 3D environment. Not at all! The big remaining question then is: Would you really choose an Ambisonic microphone because of how it sounds, because it actually sounds better than other surround recording techniques?
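To make that point concrete, here is a minimal sketch of how a plain mono source can be placed into first-order Ambisonic (B-Format) without any Ambisonic microphone involved. It assumes the traditional B-Format convention (W scaled by 1/√2, azimuth counted counterclockwise from the front, so +90° is left). The function name is purely illustrative, and other channel orders and normalizations exist.

```python
import math

def encode_b_format(sample, azimuth_deg, elevation_deg=0.0):
    """Encode one mono sample into first-order B-Format (W, X, Y, Z).

    Assumed convention: traditional B-Format, W attenuated by 1/sqrt(2),
    azimuth counterclockwise from front, elevation upwards.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)                # omnidirectional component
    x = sample * math.cos(az) * math.cos(el)   # front/back figure-8
    y = sample * math.sin(az) * math.cos(el)   # left/right figure-8
    z = sample * math.sin(el)                  # up/down figure-8
    return (w, x, y, z)

# A source hard left in the horizontal plane ends up almost entirely in Y.
w, x, y, z = encode_b_format(1.0, azimuth_deg=90.0)
```

In a real project this is what an Ambisonic panner plug-in does per sample for every mono or multichannel source you feed it.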

The Rise of Ambisonic microphones

What is actually the big deal about all these Ambisonic microphones? There are several benefits. The first is that it is a relatively cost-effective (not to say cheap) way to record surround sound compared to a bunch of other possible setups. The second is that it is a relatively small setup that can easily be mounted on a camera, even with a windshield. The third might be the channel count: Even with only four channels it is theoretically possible to create surround images for setups with a much higher speaker count. This leads to flexibility: Using an Ambisonic decoder, basically all possible microphone characteristics can be calculated, at least for mono or for coincident stereo and surround setups. Last but not least, the misunderstanding of Ambisonic (see above) might be a reason for some to use this setup.
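As a rough illustration of that flexibility, the following sketch derives a virtual microphone of any first-order characteristic from a single B-Format sample. It is my own simplified example, not the algorithm of any particular decoder. It assumes traditional B-Format with W scaled by 1/√2, and the `pattern` parameter (1.0 = omni, 0.5 = cardioid, 0.0 = figure-8) is a naming choice made for this example.

```python
import math

def virtual_mic(b_format, azimuth_deg, elevation_deg=0.0, pattern=0.5):
    """Point a virtual first-order microphone into a B-Format sample.

    b_format: tuple (W, X, Y, Z), W assumed to be scaled by 1/sqrt(2).
    pattern:  1.0 = omni, 0.5 = cardioid, 0.0 = figure-8.
    """
    w, x, y, z = b_format
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    # Pressure part (undo the 1/sqrt(2) scaling of W).
    omni = math.sqrt(2.0) * w
    # Figure-8 part aimed at the chosen direction.
    figure8 = (x * math.cos(az) * math.cos(el)
               + y * math.sin(az) * math.cos(el)
               + z * math.sin(el))
    return pattern * omni + (1.0 - pattern) * figure8

# A unit source directly behind the array, as B-Format (W, X, Y, Z).
behind = (1.0 / math.sqrt(2.0), -1.0, 0.0, 0.0)
front_cardioid = virtual_mic(behind, azimuth_deg=0.0, pattern=0.5)
rear_cardioid = virtual_mic(behind, azimuth_deg=180.0, pattern=0.5)
```

The front-facing cardioid rejects the source behind it while the rear-facing one picks it up at full level. Note that every virtual microphone derived this way still sits at the same point in space, so the result is always a coincident setup.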

What we need from recordings for surround, Atmos and VR

What are we actually after when recording, and what are we after when doing sound for VR? In my experience, we are always after the best sounding solution. When recording for VR, a surround recording also needs to be perfectly rotatable without losing the locatability of the sound source. We want to create immersive, deep and emotionally engaging acoustic environments. We want listeners to be sucked into the acoustic world; we want them to believe the world they are hearing and not be distracted by it.
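For reference, this is what "rotatable" looks like on the signal level for a first-order B-Format sample: a head turn around the vertical axis is just a 2D rotation of the X and Y channels. The sketch below is a simplified illustration under the assumption that azimuth is counted counterclockwise from the front; the function name is made up for this example.

```python
import math

def rotate_yaw(b_format, angle_deg):
    """Rotate a B-Format sample (W, X, Y, Z) around the vertical axis.

    Positive angles move sources counterclockwise (towards the left).
    W and Z are unaffected by a rotation around the vertical axis.
    """
    w, x, y, z = b_format
    a = math.radians(angle_deg)
    x_rot = x * math.cos(a) - y * math.sin(a)
    y_rot = x * math.sin(a) + y * math.cos(a)
    return (w, x_rot, y_rot, z)

# A source straight ahead, moved 90 degrees to the left.
front = (1.0 / math.sqrt(2.0), 1.0, 0.0, 0.0)
left = rotate_yaw(front, 90.0)
```

The math is trivial, which is part of the appeal of the format. Whether the rotated result is still easy to locate by ear is a separate question, and that is what the listening examples are for.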

Current microphone techniques

So, which options do we have in general to create surround recordings? That isn’t the easiest thing to answer, especially when such recordings should still be compatible with VR applications. There are two reasons:

Most surround recording techniques have been developed for surround speaker systems (5.1, 7.1). These systems are not really rotatable without audible artefacts.
VR is more than only surround: it also has height information, something you also find in Atmos or Auro3D, except that Atmos and Auro3D do not need rotatable sound.

But let’s focus on the surround image first.

There are basically two main categories of multichannel recording techniques: coincident microphones and spaced microphones.

Coincident microphone setups are, for example, XY and MS for stereo; for surround they are Double-MS, Blumlein and Tetrahedron (aka Ambisonic microphones), and B-Format or Triple-MS for 3D. Spaced microphone setups for stereo are, for example, ORTF, NOS or spaced A/B (plus several others); for surround these would be the various MMAD systems out there (OCT-5/9, omni arrays).

Coincident microphones create stereo or surround images through direction only, i.e. through the level differences that result from the directivity of the capsules; ideally there is no time difference between the capsules. Spaced microphones create imagery through time difference (and through time difference alone when only omni microphones are used, which is not a must). Our ears do both.
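To put rough numbers on these two cues, here is a small sketch that compares the time difference of a spaced pair with the level difference of a coincident cardioid pair for the same source direction. It is a simplified model resting on several assumptions: a far-field source, ideal cardioids, a speed of sound of 343 m/s, and azimuth counted counterclockwise from the front. The spacing and angle values are arbitrary examples.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def time_difference_ms(spacing_m, azimuth_deg):
    """Arrival time difference between two spaced capsules (far field)."""
    path_difference = spacing_m * math.sin(math.radians(azimuth_deg))
    return 1000.0 * path_difference / SPEED_OF_SOUND

def cardioid_gain(source_az_deg, mic_az_deg):
    """Ideal cardioid sensitivity for a source at source_az_deg."""
    return 0.5 + 0.5 * math.cos(math.radians(source_az_deg - mic_az_deg))

def xy_level_difference_db(source_az_deg, half_angle_deg=45.0):
    """Left/right level difference of a coincident XY cardioid pair."""
    left = cardioid_gain(source_az_deg, +half_angle_deg)
    right = cardioid_gain(source_az_deg, -half_angle_deg)
    # Clamp to avoid log(0) exactly in a cardioid's null.
    left, right = max(left, 1e-12), max(right, 1e-12)
    return 20.0 * math.log10(left / right)

# Same source, 30 degrees to the left, seen by both approaches.
dt = time_difference_ms(spacing_m=0.5, azimuth_deg=30.0)  # spaced omnis
dl = xy_level_difference_db(source_az_deg=30.0)           # coincident XY
```

The coincident pair only ever yields a level difference, and the spaced omni pair only a time difference. Near-coincident setups with directional capsules deliver a bit of both.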

What are the pros and cons of both types of setups? This list is obviously not complete, and it also depends on the particular coincident or spaced setup and the specific microphone in question. It is just a general guideline.

Coincident Surround Pros:

Potentially small footprint, most likely fits easily into windshields
Easy to set up
Works well on close sound sources

Coincident Surround Cons:

Small sound image
Very difficult locatability of sound source
Diffuse phantom sound source

Spaced Surround Pros:

Wide and pleasing surround image
Very good translation of depth / distance of sound source
Very good locatability

Spaced Surround Cons:

Mostly large setups, often difficult for field recordings
Potentially less mono compatible

Since Dolby Atmos and VR have become important in the media business, there are now a few solutions for capturing height information with spaced microphone setups, but only Ambisonic and Triple-MS for coincident microphone setups.

When would I use spaced and when Ambisonic microphones?

Most spaced microphone setups are rather large and have a lot more channels to handle compared to Ambisonic microphones. Even though, to my mind, the sonic quality of spaced surround setups massively exceeds the listening experience of coincident microphones, in some situations it is simply impossible to use a spaced setup, for example in some cars, in airplane cockpits, in helmets and in other tight spaces. But the only reason I personally would use an Ambisonic microphone is that I think a surround and VR compatible recording might be better in these scenarios than having only stereo or even mono. Especially with ambiences, I try to avoid Ambisonic microphones as much as possible.

The End

Whenever I was asked over the past four or five years why we at Boom Library wouldn’t record anything for VR, the answer was simple: The quality, character and emotional depth of recordings made with Ambisonic microphones is not at all what I am personally striving for. Since Schoeps came up with the ORTF3D microphone setup, which perfectly combines imagery through direction and time difference at a relatively small size and inside only one windshield, I think I have found a solution which offers all I wanted. And I am sure similar solutions by other manufacturers will follow.

Audio Examples

Of course, all this means nothing without the possibility to listen to some examples. I would like to offer four different ways to compare the coincident Ambisonic microphone Sennheiser AmbeoVR (I am not judging this microphone compared to other Ambisonic microphones here!) to the spaced (aka near-coincident) Schoeps ORTF3D:

The original recordings – raw, unencoded
A Quad surround encoded file for the Ambisonic microphone
A binaural mixdown for both Ambisonic and ORTF3D
A binaural mixdown of one longer recording that gets rotated by 360° to the right over the length of the recording

Several things that are important to me: How is the overall imagery? How well can I locate and follow sound sources within a recording? How well can I follow a sound source when it gets rotated? Is the recording inspiring and believable; do I need time to adapt to the scenario or do I instantly know where any of the surrounding sound sources is located? How does it translate from headphones over 5.1 surround in smaller rooms to big Dolby Atmos theater rooms?

If you want to try the original ORTF3D sounds in VR, simply place the channels in a virtual cube around you.
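As a starting point for that, here is a small sketch that computes the eight corner positions of such a cube around the listener. It assumes eight channels arranged as a lower and an upper layer of four corners each; the channel labels are hypothetical placeholders, so please match them to the actual channel order of the files. The axis convention (x = front, y = left, z = up) is also just an assumption and has to be adapted to your engine.

```python
import math

def cube_positions(half_edge=1.0):
    """Return {label: (x, y, z)} for the eight corners of a cube.

    Listener sits at the origin; x = front, y = left, z = up.
    Labels are hypothetical and must be mapped to the real channels.
    """
    corners = {"L": (1, 1), "R": (1, -1), "Ls": (-1, 1), "Rs": (-1, -1)}
    positions = {}
    for layer, z in (("low", -1), ("up", 1)):
        for name, (x, y) in corners.items():
            positions[f"{name}_{layer}"] = (
                x * half_edge, y * half_edge, z * half_edge)
    return positions

def to_angles(x, y, z):
    """Convert a position to (azimuth, elevation) in degrees."""
    azimuth = math.degrees(math.atan2(y, x))
    elevation = math.degrees(math.atan2(z, math.hypot(x, y)))
    return azimuth, elevation

positions = cube_positions(half_edge=2.0)
angles = {label: to_angles(*pos) for label, pos in positions.items()}
```

Each channel then becomes one point source at its corner, and the VR engine's own spatializer takes care of rotation and binaural rendering.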

Download Audio Sample

ORTF3D vs. Binaural - use Headphones!

The Ambisonic recording has been converted to Quad using the SurroundZone2 plug-in from SoundField. Both the ORTF3D and the Ambisonic recordings were then converted to binaural using the Mach1 Monitor (www.mach1.tech).

You can find all original unencoded files in the “Download Audio Sample” pack from above. Feel free to experiment with any other binaural converter or setup to make your own comparison.