
Immersive audio formats such as Dolby Atmos have changed expectations around scale, movement and realism in modern sound design. Height speakers, object-based panning and increasingly complex playback systems offer new creative possibilities, but they also expose a gap between what sound designers expect immersive formats to deliver and what listeners actually perceive.
To understand that gap, it helps to step away from tools and workflows and look at how human hearing actually works.
Few people have explored that territory in more depth than Hyunkook Lee, Professor of Psychoacoustics at the University of Huddersfield and founder of the Applied Psychoacoustics Lab (APL). Hyunkook has spent years running controlled listening experiments on vertical localisation, spectral perception and immersive playback systems, often with results that challenge long-held assumptions in professional audio.
In a recent discussion with BOOM Library focused on sound effects work for immersive formats, he shared a number of findings that have direct consequences for how sounds are designed, layered and routed in Atmos and other 3D environments.
Why Height Speakers Often Disappoint
One of the most common mistakes when moving from 5.1 or 7.1 into immersive formats is assuming that height speakers automatically create convincing vertical placement.
“There is all that expectation about the difference of the height speakers,” Hyunkook explains. “But the way we localise sounds vertically is mainly about spectral cues.”
Unlike left–right localisation, which relies heavily on timing and level differences between the ears, vertical localisation depends on how frequency content is shaped by the head, shoulders and outer ear. This leads to a fundamental limitation.
“Higher frequencies tend to be localised higher, lower frequencies localised lower,” he says. “Even if the physical position of the sound is high, if it’s something like 100 Hz, we only localise it at ear height or below.”
For sound designers, this means that routing low-frequency-dominant material to height speakers often has little audible benefit. A jet, distant thunder or heavy machinery may feel like it belongs overhead, but unless there is sufficient high-frequency content, the brain does not place it there.
“When the content is predominantly low-frequency dominant,” Hyunkook explains, “the height speaker channels contribute very little in terms of vertical localisation.”
Broadband Energy Matters More Than Position
This does not mean that large environmental sounds cannot feel elevated. The distinction lies in the bandwidth of the signal.
“There’s a difference between narrow-band sounds and broadband sounds,” Hyunkook explains. “Jet noise or helicopters can actually be perceived as elevated because there are elements like transient energy and harmonics.”
In experiments using pink noise played through height speakers, listeners perceived strong vertical spread because the signal contained energy across the spectrum. But once that same noise was low-passed, the illusion collapsed.
“If you low pass it at around 1kHz, there is no way you will perceive it as coming from the actual height speaker.”
For sound designers, the implication is simple but easy to overlook. Static rumbles, drones and low layers rarely benefit from height placement. Transients, crackling textures and broadband elements are far more effective.
“Anything that has some sparkles or crackling and transient energies,” Hyunkook says, “ideally broadband sounds with enough high-frequency energy.”

More Speakers Do Not Guarantee Better Immersion
One of the more sobering conclusions from Hyunkook’s research is that increasing channel counts does not automatically improve perceived immersion.
“I’ve compared 22.2, 9.1, 5.1 to stereo in terms of vertical spread and envelopment,” he says. “Often 24 speakers don’t sound any better than nine speakers.”
In controlled listening tests, downmixed versions frequently produced results statistically indistinguishable from the full-channel originals. A major reason lies in how reverb and ambience are typically handled.
“A lot of people put reverb into the height speakers,” Hyunkook explains, “but the reverb doesn’t really have much height cue. It’s mainly rolled off above 4 kHz.”
Without meaningful high-frequency content, additional channels add complexity rather than clarity.
Using Frequency as a Spatial Tool
One of the most practically useful ideas to emerge from Hyunkook’s work is perceptual band allocation, or PBA. Rather than treating spatial placement as purely geometric, PBA treats frequency as a spatial parameter.
“PBA allocates different frequency bands to either lower or upper speakers, depending on their unique perceived positions,” he says.
In its simplest form, this involves splitting a signal into two frequency bands and routing them differently, for example, sending frequencies above 1 kHz to the height speakers and lower frequencies to ear-level speakers.
“I applied this approach to upmixing 2D reverb into 3D,” Hyunkook explains. “That sounded much better than the original 3D reverb.”
Because different frequency bands arrive from different physical positions, the ear integrates them naturally, without the phase smearing and cancellations that often occur when the same signal is reproduced from multiple locations.
“You don’t have any smearing at the ear,” he says. “Different frequencies arriving from different positions sum together naturally.”
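The two-band split at the heart of PBA can be sketched in a few lines. The example below is a minimal illustration, not a production crossover: it uses a brick-wall FFT split rather than the smooth crossover filters a real renderer would use, and the 1 kHz crossover point and channel names (`ear_level`, `height`) are illustrative assumptions based on the example above.

```python
import numpy as np

def pba_split(signal, sr, crossover_hz=1000.0):
    """Split a mono signal into low/high bands via an FFT brick-wall mask.

    Returns (low_band, high_band); the two bands sum back to the input,
    so no energy is lost when they are routed to different speaker layers.
    """
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    high_mask = freqs >= crossover_hz
    low = np.fft.irfft(np.where(high_mask, 0.0, spectrum), n=len(signal))
    high = np.fft.irfft(np.where(high_mask, spectrum, 0.0), n=len(signal))
    return low, high

# Illustrative PBA routing: a test signal with a 200 Hz and a 4 kHz component.
# The low band would feed the ear-level speakers, the high band the heights.
sr = 48000
t = np.arange(sr) / sr
source = np.sin(2 * np.pi * 200 * t) + 0.3 * np.sin(2 * np.pi * 4000 * t)
ear_level, height = pba_split(source, sr, crossover_hz=1000.0)
```

Because the two bands are complementary, summing them reconstructs the original signal exactly, which mirrors the point above: the ear integrates the bands arriving from different positions without the cancellations that occur when the same full-band signal is duplicated across speakers.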
Why ORTF-3D Recordings Translate Well to Immersive Formats
Hyunkook’s research also helps explain why certain field-recorded sound libraries behave more convincingly in immersive mixes than others.
If height perception relies on real spectral and timing cues, then capturing those cues at the recording stage matters. Recording approaches such as ORTF-3D preserve level differences, time differences and vertical information in a way that aligns well with how the ear interprets space.
Rather than constructing height artificially later, these recordings already contain usable spatial information that holds together when placed into immersive beds.
This is particularly relevant for environmental libraries such as SWAMPS, where ORTF-3D recording captures dense insect beds, wildlife and atmospheric movement with stable localisation and phase coherence. The result is ambience material that sits naturally in immersive formats without heavy processing or corrective EQ.
The same applies to THUNDERSTORM CHASER, which documents real storms using a Schoeps ORTF-3D microphone array. The combination of broadband energy, transient lightning cracks and naturally evolving spatial cues makes these recordings especially effective in Atmos mixes, whether used subtly for envelopment or more dramatically across the height layer.
In both cases, the recordings align closely with Hyunkook’s findings. They provide the spectral and temporal information the ear needs, rather than relying on speaker placement alone to create the illusion of height.
Phantom Images and the Illusion of Height
Another counter-intuitive finding from Hyunkook’s work is the effectiveness of phantom images, particularly in the upper plane.
“A phantom image is more effective for elevation perception than a real image in the upper plane,” he says.
Sending the same signal to multiple height speakers to create a phantom centre can result in stronger perceived elevation than placing a sound in a single overhead speaker.
“As soon as you make it a phantom centre, you now have natural elevation,” Hyunkook explains. “It can be perceived as higher than the physical height of the speaker.”
Notably, this effect does not require large amounts of high-frequency energy.
“Even a 500 Hz octave band can be elevated quite high,” he says, “with a phantom image.”
For sound designers working with mono effects or legacy material, this suggests that distribution across speakers can often be more effective than precise object placement.
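Creating a phantom centre between a height speaker pair is a simple duplication with a pan-law gain reduction. The sketch below assumes a standard equal-power pan law of roughly -3 dB per channel so the summed acoustic power stays close to the original mono level; the function name and channel labels are illustrative, not taken from any particular tool.

```python
import numpy as np

def phantom_center(mono, pan_law_db=-3.0):
    """Duplicate a mono signal to a left/right height speaker pair.

    Applying about -3 dB to each feed keeps the combined acoustic power
    roughly equal to the original mono signal (equal-power pan law).
    """
    gain = 10.0 ** (pan_law_db / 20.0)  # -3 dB -> ~0.708 linear
    return mono * gain, mono * gain

# 500 Hz tone, echoing the octave-band example from the interview.
sr = 48000
t = np.arange(sr) / sr
mono = np.sin(2 * np.pi * 500 * t)
top_left, top_right = phantom_center(mono)
```

The identical feeds are what create the phantom image between the two speakers; the pan law only preserves loudness, while the perceived extra elevation comes from the psychoacoustic effect Hyunkook describes above.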
A Practical Takeaway for Sound Designers
Asked for a single takeaway for sound designers working in immersive formats, Hyunkook comes back to frequency rather than format.
“If you want to make the height speakers more effective,” he says, “high frequencies to the height speakers, low frequencies to the main speakers.”
Because of head-related transfer functions (HRTFs), different speaker positions effectively behave like natural equalisers.
“You can use these speakers almost like a natural EQ,” Hyunkook explains. “If you want to make a sound brighter, you can simply route it to the height speakers as they have more high frequency energies compared to the main speakers in HRTF.”
In immersive sound design, localisation is often fragile and ambiguous. Tonal balance is not.
“It’s not just about localisation,” he says. “It’s about distributing your signals to positions that give you the most natural balance.”
For sound designers creating immersive sound effects, the message is clear. Immersion does not come from more speakers or more complexity, but from understanding how the ear responds to frequency, space and movement, and choosing or designing source material that works with those perceptual limits rather than against them.
Learn More About Immersive Audio
If you want to find out more about Hyunkook Lee's research, here are some selected papers on the subject.
Lee, H. “Perceptual Band Allocation (PBA) for the Rendering of Vertical Image Spread with a Vertical 2D Loudspeaker Array,” Journal of the Audio Engineering Society, 64 (12), pp. 1003-1013, 2016.
Lee, H. “2D to 3D Ambience Upmixing Based on Perceptual Band Allocation,” Journal of the Audio Engineering Society, 63 (10), pp. 811-821, 2015.
Lee, H. and Gribben, C. “Effect of Vertical Microphone Layer Spacing for a 3D Microphone Array,” Journal of the Audio Engineering Society, 62 (12), pp. 870-884, 2014.
Lee, H. “Multichannel 3D Microphone Array: A Review,” Journal of the Audio Engineering Society, 69 (1/2), pp. 5-26, 2021.
Lee, H. “Sound Source and Loudspeaker Base Angle Dependency of the Phantom Image Elevation Effect,” Journal of the Audio Engineering Society, 65 (9), pp. 733-748, 2017.
Eaton, C. and Lee, H. “Subjective Evaluations of Three-Dimensional, Surround and Stereo Loudspeaker Reproductions Using Classical Music Recordings,” Acoustical Science and Technology, 43 (2), pp. 149-161, 2022.
To find out more about the Applied Psychoacoustics Laboratory (APL) and the tools it has created to help those mixing in immersive formats, visit the APL website: https://apl-hud.com/



