Finding more space: Sennheiser on its journey with Netflix to expand two channel audio to bring in an immersive feel

At NAB 2023, Sennheiser introduced its Ambeo 2-Channel Spatial Audio renderer for live broadcasting. With the technology already established as a file-based solution for over-the-top (OTT) streaming, the company is now looking for sports broadcast partners to help bring more immersive space to live sports.

Head of Sennheiser Pro Labs, Renato Pellegrini, tells us about the renderer’s development with Netflix, why it is so important to partner directly with sports broadcasters, and how he hopes it will encourage every broadcaster to make immersive audio its default output format.


We talk a lot about immersive audio for live sports. We talk about how sound designers are creating experiences to build an emotional connection, and how broadcasters are experimenting with different formats and techniques to manage the sound. We talk about how efficient metadata is at keeping us organised, and how artificial intelligence (AI) is helping to automate processes to reduce the workload on the overworked mix engineer.

We have even discussed what immersive actually means on a philosophical level. As an industry we’re all over it.

But outside the industry, that’s not necessarily the case. Despite the popularity of 3D soundbars and the built-in accessibility of Dolby Atmos in Sky Glass, not everyone is reaping the benefit. Most people don’t have a home cinema system, and even if they do, half the time they might be watching the game on their phone, or their laptop, or on a tablet. In real life, most people are still listening to programme output on two channels.

But what if there was a way to provide a more immersive experience from every single 3D audio mix, irrespective of the device you are using? Over the last few years Sennheiser’s Pro Labs have been looking for ways to do exactly this, to extract more value from immersive mixes and deliver them to a wider range of users.

Netflix steps up

Sennheiser’s Ambeo 2-Channel Spatial Audio renderer was launched in June 2022, when Netflix was the first OTT provider to go live with it in the cloud.

The renderer creates an enhanced two-channel mix from an immersive signal that works across headphones, TVs, and other stereo environments. Netflix now provides more than 700 titles which use the technology; Premium Plan subscribers can find them by searching for “spatial audio”.

The development attracted attention from sports broadcasters looking to enhance their offering, and following feedback at IBC 2022, Sennheiser was quick to develop a stream-based version. The Ambeo 2-Channel Spatial Audio renderer for live production was launched at NAB 2023.

Pellegrini says: “It is more expensive to produce an immersive mix, and if we can only validate the additional cost by the number of immersive listeners, it is a more difficult business case. While many high-profile sporting events are already produced in immersive formats, a spatialised renderer like Ambeo can encourage broadcasters to invest in immersive mixes at other events because it also improves the listening experience for the majority of viewers listening in stereo.

“If we can help justify that cost by adding value for the entire audience it becomes an easier argument; we think that technologies like Ambeo 2-Channel Spatial Audio up the game for all viewers.”

Not like other renderers

Of course, there are a number of spatialised renderers on the market, buoyed by the widespread adoption of binauralised audio from big brands like Apple. Nor is it a new technology; binaural audio has been around as an immersive two-channel format since 1881.

But according to Pellegrini, that is not what this is. Firstly, unlike many other renderers, Ambeo works in real time, which Pellegrini admits was quite a challenge: “In fact, it was our biggest challenge. The software needs to create and QA two independent mixes to be on air within a couple of picture frames; in Europe the allowed total end-to-end latency from microphone input to mix output must be below 80ms. Such short time frames can in practice only be met using automation so we set out to design a tool which enables mixers to concentrate solely on the immersive mix and trust the system to create the spatialised stereo mix automatically.”

Secondly, the way it spatialises the mix is different to other renderers. It is not stereo and it is not binaural. It is something else, he says. “Sennheiser is well known for headphones, so in the early development stages we asked Netflix if they wanted something that would sound great on headphones,” says Pellegrini. “They replied that because most of their end users listen to content on a TV, they would be more interested if we could make the TV sound output more immersive. But that they also wanted improvements to the headphone output as well!

“It impacted on development in a positive way. We already knew how to approach a binaural headphone mix so our goal became how to develop something which sounded better on all stereo devices. We worked directly alongside Netflix for 18 months, as well as with producers and end users to make sure the benefits were there for everyone.”

Using a combination of transaural, binaural and some other (confidential) technologies, Sennheiser says the renderer provides a more immersive experience whatever you are listening on.

Automating the system

“We had extensive discussions with production companies to ensure their recording engineers were happy with the output and we learned that different production companies worked with very similar settings to each other, and independent of genre. A documentary and an action movie would both use very similar settings,” adds Pellegrini.

“It meant that we could automate the process for the majority of use cases and most content providers were happy with the automated mix. It uses the original metadata which is stored in the ADM file to identify the position and level of the sound beds and objects so we know where it needs to be at any given point in time, and the renderer uses the information to find the best possible immersion in a two-channel set up.”
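As a rough illustration of how position and level metadata can drive a two-channel output automatically, the Python sketch below pans each object between left and right using a plain constant-power pan law. This is emphatically not how Ambeo works (Sennheiser's processing is confidential and goes well beyond amplitude panning); the `AudioObject` class, the `stereo_gains` function, the azimuth sign convention and the ±30 degree loudspeaker span are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class AudioObject:
    """Hypothetical stand-in for one object's metadata at a single instant."""
    name: str
    azimuth_deg: float  # assumed convention: 0 = straight ahead, positive = left
    gain: float         # linear level


def stereo_gains(obj: AudioObject, max_angle_deg: float = 30.0) -> tuple[float, float]:
    """Constant-power pan of one object between the left and right channels."""
    # Clamp to the span of the two loudspeakers, assumed here at +/-30 degrees.
    az = max(-max_angle_deg, min(max_angle_deg, obj.azimuth_deg))
    # Map full left to 0 and full right to pi/2.
    theta = (max_angle_deg - az) / (2 * max_angle_deg) * (math.pi / 2)
    return obj.gain * math.cos(theta), obj.gain * math.sin(theta)


# Illustrative objects only; real metadata would also vary over time.
commentary = AudioObject("commentary", azimuth_deg=0.0, gain=1.0)
crowd_left = AudioObject("crowd left", azimuth_deg=30.0, gain=0.5)

for obj in (commentary, crowd_left):
    left, right = stereo_gains(obj)
    print(f"{obj.name}: L={left:.3f} R={right:.3f}")
```

The point of the sketch is only that once position and level are available as data, the two-channel gains can be computed without an operator touching a fader, which is the kind of automation Pellegrini describes.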

Pellegrini’s team also developed tools which enable users to fine-tune the output, and the timing of Sennheiser’s acquisition of Merging Technologies in July 2022 provided an opportunity to complement that development.

Merging Anubis

Working under the umbrella of the Sennheiser group, Merging Technologies already had established hardware which could provide the tools for the job. The acquisition enabled Sennheiser to develop a prototype plugin for Merging’s Anubis interface that included Ambeo 2-Channel Spatial Audio live audio processing, which was launched at this year’s NAB.

Pellegrini: “Utilising the Anubis was an obvious solution as it is already established in OB vans and in studios as a monitoring control unit. It can take a 5.1.4 input and produce a stereo stream almost immediately. We did some work on the renderer to speed up the process, and we achieved a latency through the renderer of 16 samples at 48kHz.

“The Anubis UI also provides a way to fine-tune the two-channel mix, both on the Anubis remote control software and directly on the Anubis hardware. The controls provide a simplified overview of the 5.1.4 stems to enable the user to not only define the level of each leg (LR, C, LFE, LsRs and height, as well as sides if applicable), but also to change the level of contribution of each of these mixes.

“The left section of the panel allows the operator to isolate individual legs of the immersive mix for fine tuning. Pre-listens are a button press away, as well as A/B listening between the Ambeo mix and a standard Dolby Atmos mix.

“For example, a user can isolate the commentary on the centre speaker as independent while the rest of the presentation can be wider. In fact, we’ve found that this can help speech intelligibility; the cocktail party effect of our brain allows us to better focus on the commentary if the rest of the audio is in a wider soundfield. The more we open it up, the easier it is to focus on the commentary.”
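To put Pellegrini's latency figure in context, 16 samples at 48kHz works out at a third of a millisecond, a small fraction of the 80ms end-to-end limit he cites. The short Python sketch below simply does that arithmetic; the function and variable names are ours, for illustration, and are not part of any Sennheiser or Merging software.

```python
def samples_to_ms(samples: int, sample_rate_hz: int) -> float:
    """Convert a latency expressed in samples to milliseconds."""
    return samples / sample_rate_hz * 1000.0


renderer_ms = samples_to_ms(16, 48_000)  # renderer latency quoted in the article
budget_ms = 80.0                         # end-to-end limit quoted for Europe

print(f"Renderer latency: {renderer_ms:.3f} ms")
print(f"Share of the 80 ms budget: {renderer_ms / budget_ms:.2%}")
```

This covers the renderer stage only; the rest of the chain from microphone to mix output still has to fit inside the same budget.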

Partner up

With the groundwork in place, Sennheiser is now looking to partner with sports broadcasters and other content producers to integrate the renderer into their existing workflows.

Pellegrini notes: “One of the reasons we announced this product as available for field testing was because we want to make sure that the workflows we are proposing are a good fit with the sports broadcasting community. For this reason, we are actively looking to work with third parties across both OTT and OTA models.

“If operators can focus on crafting a full immersive mix while Ambeo improves the experience for everyone else it can help broadcasters justify doing more productions in immersive audio.

“Automating some of the technicalities, like the creation of multiple output formats and ensuring sample accuracy, enables mixers to concentrate on creativity. And if we can help streamline that while still allowing broadcasters to preserve the original intent of the mix, that’s where we think we can add value.”

 
