How real is real? Asking the deep questions at SVG Europe Audio’s Look back at Qatar

Kicking off the SVG Europe Audio Immersive Audio Forum – Look back at Qatar, [Top row] Heather McLean, SVG Europe Audio, head, Nicolas Brie, HBS, engineering manager [bottom row] Felix Krückels, sound designer, mixer, consultant & educator and Roger Charlesworth, industry consultant

SVG Europe is not afraid of asking the big questions, and at its first audio event of 2023 the big question was philosophical. In the first Audio Forum session of the year, we looked back on last year’s FIFA World Cup in Qatar, what it took to get there, and what it takes to build a robust ST2110 network to get everything on air.

Year on year, Host Broadcast Services (HBS) has been adding more levels of immersion to its coverage of the FIFA World Cup, and more than 140 people signed up for SVG Europe’s Immersive Audio Forum – Look Back at Qatar to find out exactly how the sound of the biggest football competition in the world has developed over the last 20 years.

With special guests Felix Krückels, sound designer, mixer, consultant and educator, and Nicolas Brie, HBS, engineering manager, the session kicked off with a philosophical question: what does immersion actually mean?

Krückels has a good idea. He has been working in broadcast production since 1994 and has worked on many football World Cups as well as the Bundesliga in Germany. Currently the director of sound and music production for the Bachelor Programme at the Darmstadt University of Applied Sciences, he also spent over 14 years helping Lawo develop broadcast products for live broadcast.

After decades of exploring how to bring viewers closer to the action, Krückels asked what it means when we talk about immersion. He explained how immersion is a personal journey; it already happens in surround, it can take place in stereo and even in mono, and immersion means different things to different people.

“You can be immersed in water, or by the beautiful scenery in a forest surrounded by birds and trees, or by people you are with at a party or a rock concert,” said Krückels. “In a football stadium, you are immersed by the sound pressure; for some people the architecture can immerse them; it’s the scenery, with fans cheering alongside you; it’s the emotion of the opening ceremony.

“In a video game, immersion tries to provide something that doesn’t exist in reality; it puts you into scenery which you typically don’t have in real life. In audio, immersion is a buzzword for 3D audio but in football, immersion is not just the audio; it’s the 40-plus cameras we used for the World Cup, it’s the storytelling, it’s the detail through graphics, commentary and replays, and it’s the audio which ties everything together.”

Krückels explained how the creation of immersive audio experiences goes beyond surround formats to encompass every aspect of audio that brings the viewer at home closer to the game and how it has to work in parallel with the rest of the presentation to create the illusion.

Illuminating presentation

Today the FIFA World Cup mic plan has 16 pitch mics, 20 crowd mics, and over 40 camera mics: more than 80 in total. Krückels’ illuminating presentation began with FIFA’s adoption of 5.1 surround in 2006. Since then, it’s been about building up layers to create new levels of immersion that build resonance with the atmosphere of the game and a stronger connection to the venue. A traditional approach to capturing the crowd led to the addition of audience spot mics in 2010, which were placed at 90° to the viewer for clearer localisation.

“This was an attempt to create more immersion by creating a more exaggerated impression of the stadium. We used stereo mics to keep the width and size consistent, but the signals from the left and the right stands are completely decorrelated.”

In 2013 FIFA completed its first tests with UHD video which included 3D audio. In the beginning, crowd noise was added in the height speakers, but according to Krückels, “it didn’t create more immersion”.

To create a better correlation between the main speakers and the height speakers the team added more mics for the left and the right sides in the height layer, working with Schoeps to create a prototype array. This was refined for the 2018 World Cup when Krückels discovered that our brains do not need the time delay between the main layer and the height layer for height perception; we just need the level difference. Schoeps designed a much more compact system.

HBS also worked with Lawo to introduce another technology for the 2018 competition.

Krückels notes: “In 2018 we also introduced the Kick automated ball mixing software. The ball is tracked by cameras and microphone levels are automatically adjusted to create an audio object with completely level and consistent sound pressure levels. This is so important in immersive audio because we can use audio objects to create different presentations.

“In 2018 we created a ‘pub mix’ presentation. In a pub there is no need for a crowd; immersion comes from the pub itself! There is already a crowd, there is beer, there are smells, cheering and complaining! But they still need to hear the ball kicks and the PA, the fan zones on the left and right, and see the camera feeds.”
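The idea behind the automated ball mix, with microphone levels following the tracked ball, can be pictured with a short sketch. This is not how Kick itself works, which the session did not detail; it assumes a simple distance-based weighting and a constant-power normalisation, and the function name, mic coordinates and `rolloff_m` parameter are all hypothetical.

```python
import math


def mic_gains(ball_xy, mic_positions, rolloff_m=10.0):
    """Return one gain per pitch mic for a tracked ball position.

    Assumption: mics closer to the ball get more weight, and the gains are
    normalised so that the sum of their squares is 1 (constant power),
    which keeps the overall level steady as the ball moves.
    """
    raw = []
    for mic_x, mic_y in mic_positions:
        distance = math.hypot(ball_xy[0] - mic_x, ball_xy[1] - mic_y)
        # Weight falls off smoothly with distance; rolloff_m is a made-up tuning value.
        raw.append(1.0 / (1.0 + (distance / rolloff_m) ** 2))
    norm = math.sqrt(sum(weight * weight for weight in raw))
    return [weight / norm for weight in raw]


# Two hypothetical mics, 100 m apart along one touchline.
mics = [(0.0, 0.0), (100.0, 0.0)]
near_first_mic = mic_gains((0.0, 0.0), mics)
midway = mic_gains((50.0, 0.0), mics)
```

With the ball next to the first mic, that mic dominates; with the ball midway, both mics contribute equally. In either case the combined power stays the same, which is the property that makes the ball sound usable as a consistent audio object.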

While this presentation was just a test in 2018, Krückels says that it was used in Qatar in 2022. Because they can be combined in different ways to bring different groups closer to the action, audio objects are an example of how audio can be used to create immersive environments without relying on 5.1.4 formats. In one Manchester pub, for example, objects are being used to create a presentation for Manchester United fans in one room (with the United crowd), and Manchester City in the other.

Felix Krückels, sound designer, mixer, consultant & educator presenting at the SVG Europe Audio Immersive Audio Forum – Look back at Qatar

Making it work

Brie took up the story of how HBS took all that history and exploration, and made it work in Qatar. His detailed overview covered the competition’s entire redundant ST2110-30 network, strung between multiple Lawo UHD cores and MC²56 consoles, and incorporating two immersive audio suites and multiple galleries.

From each venue, HBS set up three 64-channel 2110 streams, with one dedicated to immersive and archives, one dedicated to centralised multifeed galleries in the IBC, and the other shared between the two. Brie explained in detail how each feed was managed by the IP network across all eight venues and five different cities, and stressed that the complexity wasn’t due to distance but due to scale.

With 32 teams competing in 64 games, games were staggered to ensure that there was enough capacity to provide full immersive coverage using two immersive suites across all eight venues.

New way of mixing

Questions from the session participants gave both Krückels and Brie the opportunity to dig a little deeper into some of the detail and the reasons behind the decisions made, such as why the height channels have the same weighting as the normal channels and why both are attenuated by 3dB.

Krückels explained why: “The reason why we chose this was because our height channels do not just contain noise, but content which is important for people to understand, like the music and PA announcers. The PA speakers are typically in the ceiling, and you can only hear these if they are mixed into the height channels.

“If we attenuated them by more than 3dB – say, 6dB or even 12dB – the PA announcements would be drowned out by the crowd noise and lost in the surround or stereo downmix. The opening ceremony, the national anthems, the speeches – all these are coming from the PA.”

This is a major evolution from how early immersive mixes were conceived, where height was added on top of the 5.1 mix; this is a far more inclusive approach, with essential elements placed in the height from the outset and folded down to the 5.1 downmix.
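The effect of that equal weighting can be shown with a little arithmetic. The sketch below is an illustration only, not the HBS downmix: it assumes each height channel folds into the main channel beneath it, that centre and LFE pass through unchanged, and the channel names and function names are hypothetical.

```python
def db_to_gain(db):
    """Convert a level in decibels to a linear amplitude gain."""
    return 10.0 ** (db / 20.0)


# Hypothetical pairing of each height channel with the main channel below it.
HEIGHT_TO_MAIN = {"TFL": "L", "TFR": "R", "TBL": "Ls", "TBR": "Rs"}


def fold_down(frame, main_db=-3.0, height_db=-3.0):
    """Fold one 5.1.4 sample frame (dict of channel -> sample) down to 5.1.

    By default the main and height layers get the same 3dB attenuation,
    as described in the session. Centre and LFE are passed through
    unchanged, which is an assumption of this sketch.
    """
    main_gain = db_to_gain(main_db)
    height_gain = db_to_gain(height_db)
    out = {"C": frame["C"], "LFE": frame["LFE"]}
    for height_name, main_name in HEIGHT_TO_MAIN.items():
        out[main_name] = (main_gain * frame[main_name]
                          + height_gain * frame[height_name])
    return out


# A PA announcement present only in the top front left height channel.
pa_only = {name: 0.0 for name in
           ["L", "R", "C", "LFE", "Ls", "Rs", "TFL", "TFR", "TBL", "TBR"]}
pa_only["TFL"] = 1.0

equal_weight = fold_down(pa_only)                 # height at -3dB
buried = fold_down(pa_only, height_db=-12.0)      # height at -12dB
```

At 3dB of attenuation the announcement arrives in the left channel at roughly 0.71 of its original amplitude; at 12dB it drops to roughly 0.25, which is why a heavier cut would leave the PA struggling against the crowd in the fold-down.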

Another point was about objects, and the challenges of monitoring multiple presentations in multiple formats (5.1.4, 5.1 and stereo, as well as mono for mobile users). With the adoption of more objects creating more presentations, this becomes difficult. Ultimately, automation or AI might be the solution, and with broadcasters asking for more output formats, could content be enriched further with these technologies?

Mixing in real time

In the closing stages of the session, participants who remembered to bring headphones were treated to a binauralised rendering of a series of audio mixes from the World Cup, where Krückels generously explained the setup in real time and identified not only how each element contributed to the final mix, but how adjustments made a difference to the final output.

There is no doubt that the audio presentation has come on in leaps and bounds over the last 20 years, and there are many more ways to tell the stories at events like the World Cup. Krückels made the point that making these aesthetic choices can create more immersion, but asked whether it can also detract from the realism.

“Half the people here could argue that it’s not any more real; you’re doing fake sound, it’s not the football match which took place. Yes. But perhaps, no. Whatever replay we see may have taken place ten minutes ago. We are already telling a dedicated story for our home viewers.”

Like all the best television, it’s about storytelling and engagement, and immersion is ultimately about getting a delicate balance of everything, as Krückels discovered.

“During the (2022) World Cup I sent a binauralised stream to a researcher friend of mine in the UK and he replied that the sound on headphones is ‘too big for my screen! It’s so immersive that it doesn’t fit with what I’m seeing in my living room!’ It sounded an alarm. For the first time the audio is much bigger than the video.”

The next SVG Europe online audio event is on 12 July with the Cloud Audio Forum – Innovator’s dilemma.