Karl Malone switches to immersive audio and keeps remote commentators connected
Karl Malone, director, sound design, NBC Sports and NBC Olympics, says that the 2020 Olympics are truly the audio Olympics. He jokes that he says that at every Olympics, but it’s hard to argue with him this time around: next-gen immersive audio in 5.1.4 (think regular surround sound with four extra channels overhead to give height to the audio) is the norm at every venue. And remote commentary is more important than ever. In other words, the efforts are impressive. He spoke to SVG in the NBC Olympics listening room at the IBC in Tokyo.
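For readers new to the notation, here is a minimal sketch of how a label such as 5.1.4 breaks down into ear-level, low-frequency (LFE), and height channels. The function name `parse_layout` and the dictionary keys are hypothetical and used only for illustration.

```python
def parse_layout(label):
    """Split a layout label such as '5.1.4' into channel counts.

    Hypothetical helper: the first number is the ear-level channels,
    the second the LFE channels, the optional third the height channels.
    """
    parts = [int(p) for p in label.split(".")]
    if len(parts) == 2:
        parts.append(0)  # e.g. '5.1' has no height channels
    if len(parts) != 3:
        raise ValueError(f"unexpected layout label: {label!r}")
    main, lfe, height = parts
    return {"main": main, "lfe": lfe, "height": height,
            "total": main + lfe + height}


layout = parse_layout("5.1.4")
print(layout["total"])  # total number of discrete channels in the layout
```

Under this reading, 5.1.4 is an ordinary 5.1 bed with four height channels added on top.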
I wanted to start with your perception of the soundscape of these Games. Is there something that sets them apart?
Going in, the focus was on the lack of crowds and how that would sound, as well as on the new immersive-audio technologies, with OBS providing 5.1.4 audio and a push to produce all our venues in immersive audio. But the lack of crowds has tended not to be a problem because we have had a lot of crowd sound, especially in swimming and gymnastics, where the other teams come in, and that’s great. That’s not to say that there aren’t empty-sounding venues, like badminton or weightlifting, but, certainly for the mainstream venues, it has become kind of a non-issue.
Michael DiCrescenzo, Senior NBC A1 and Audio Design Engineer, mixed the NBC Primetime show in Dolby Atmos and created immersive mixes to ensure consistency from sport to sport. Pierre Puglisi mixed NBC golf coverage in Dolby Atmos, but from a truck in Stamford, CT. The technical complexity of these Games was staggering.
Sonically, this has been sort of the best of both worlds, where you have a crowd and you can pick out the details of the sport. These were always going to be the audio Games, and you are going to hear these Games like you’ve never heard them before. You hear footsteps in athletics, and it’s not in a silent stadium: there is the PA, and there are people applauding. It all takes a day or two to tweak, but you hear things you’ve never heard before, and it’s fantastic.
Sky London’s athletics, golf, gymnastics, basketball, volleyball, swimming, and beach volleyball mixes have all sounded fantastic. Specialty cameras have helped a lot. There is a cable camera that goes over the top of the swimming pool, and that really helps pick up the sound, the “wet” sound of the swimming.
I loved the skateboarding, where you could hear the board coming down the rail.
Yes, skateboarding is good, but we are a bit challenged by a few sports like BMX. The ramps are smooth and the rubber wheels are soft, so you’re struggling to get those kinds of sounds. In 3×3 basketball, the court is different from regular basketball courts, where you hear squeaks and ball bounces, so you have to try to capture those [sounds] outdoors too. With these new sports, you want to promote the sport to people in both an audible and a visual way.
It’s been tough, but that’s the fun part: how do we get the most out of this sport and bring it home to tell the story?
Any advice for someone who hasn’t worked in immersive audio on what to expect and how to do it?
We’ve always talked about having an ambience base layer for the heights. I think that has always been very successful for us, and that’s what OBS has provided us with: a nice base layer to build on.
We went 16 channels wide with our edits and the audio from the trucks, and we came into this knowing that was our plan. The first eight of the 16 audio channels carry our standard 5.1, plus we’re making dummy headsets and clean announcer tracks to make edits easier. The last eight channels are entirely immersive: four carry the height channels, and the last two pairs each carry a stereo mix, which we were able to place in certain areas of the stadiums to add to this base.
For example, we would use these pairs to isolate a certain section of fans in the crowd and place it in the heights along with the base. It hasn’t been confirmed in these Olympics, but technically and from an engineering perspective we’re getting that through the edits, and it’s been very successful. Running this 16-channel workflow and getting editors to edit on 16 channels is something we’ve never done before. And that has been very helpful to us.
Maybe that’s how everyone will do it, but, for us, it’s about having the base layer and then building on that base layer by being able to add certain things to put your ear at ease.
When someone is editing with 16 channels, can they hear the height signals?
No, they don’t hear them in an immersive overhead setup but rather as isolated tracks, which they can solo and QC. Ultimately, they pass them through from the trucks or venues. Editors do their normal 5.1, and everything is mixed live in the audio control room for the final mix.
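The 16-channel layout described in the answers above can be sketched as a simple channel map. This is an illustration only: the group names, the order of channels within each group, and the helpers `split_frame` and `solo` are assumptions, not NBC’s actual configuration.

```python
# Hypothetical 16-channel map based on the description in the interview.
# Channel order inside each group is an assumption. Indices are zero-based.
CHANNEL_GROUPS = {
    "surround_5_1": range(0, 6),     # channels 1-6: standard 5.1
    "edit_aids": range(6, 8),        # channels 7-8: extra tracks such as clean announcer
    "heights": range(8, 12),         # channels 9-12: four height channels
    "stereo_pair_a": range(12, 14),  # channels 13-14: first stereo pair
    "stereo_pair_b": range(14, 16),  # channels 15-16: second stereo pair
}


def split_frame(frame):
    """Split one 16-sample frame (one sample per channel) into named groups."""
    if len(frame) != 16:
        raise ValueError("expected exactly 16 channels")
    return {name: [frame[i] for i in idx] for name, idx in CHANNEL_GROUPS.items()}


def solo(frame, group):
    """Return a copy of the frame with every channel outside `group` muted."""
    keep = set(CHANNEL_GROUPS[group])
    return [sample if i in keep else 0.0 for i, sample in enumerate(frame)]


frame = [float(i + 1) for i in range(16)]  # dummy samples: 1.0 .. 16.0
print(split_frame(frame)["heights"])       # the four height-channel samples
print(solo(frame, "heights"))              # everything but the heights muted
```

Soloing a group this way mirrors the idea of checking the height tracks in isolation rather than in an overhead speaker setup.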
Final question on the immersive front. What does it mean to see spatial audio getting its due, even seeing commercials on Apple TV about it?
I think anything that can add to the experience will be worth it. I know I say it all the time, but immersive audio makes the picture look better. If you add spatial audio and immersive audio to our production, it makes the whole production much better and more enjoyable. And it’s not that the pictures aren’t great; they look fantastic in HDR. But it’s all part of this package that you give people: the best audio quality and the best video quality give them the best possible product.
Let’s talk about commentary. There is a lot of remote commentary going on for NBC around the world. First, audio lag is always an issue, so where are you at?
It’s probably around 120 ms. It’s really nothing in a voiceover booth. But we have tennis, football, athletics, and baseball announcers and analysts at Telemundo, at Sky in London, here, or in Stamford working with a co-commentator in Tokyo. We have two spy cameras so that the announcer and the analyst can see each other from two different continents, and no one will ever know that they are separated by continents. It’s the kind of magic behind the screen.
And we have four remote venues, for indoor volleyball, basketball, golf, and beach volleyball, where the production teams are either at Sky or in Stamford while the commentators are here. It is built on the LANCE Dante system and the Calrec RP1, which is a virtual console that allows you to do a mix-minus locally so that there is no latency for the announcers. Everything is connected to Dante and integrated via MADI, then controlled by the Calrec consoles, which are located in Stamford or at Sky. It’s a seamless workflow, and no one can tell where it’s produced from, with, again, the trucks on different continents.
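A mix-minus is a return feed that contains every source except the listener’s own voice, so a commentator never hears a delayed copy of themselves. The sketch below shows the idea on plain lists of samples; it is an illustration only, not how the Calrec RP1 is implemented, and the function and source names are hypothetical.

```python
def mix_minus(sources):
    """Build a mix-minus feed for every source.

    `sources` maps a source name to a list of samples (all the same length).
    Each returned feed is the sum of all sources except the one it is sent to.
    """
    lengths = {len(samples) for samples in sources.values()}
    if len(lengths) > 1:
        raise ValueError("all sources must have the same number of samples")
    n = lengths.pop() if lengths else 0
    return {
        name: [
            sum(other[i] for key, other in sources.items() if key != name)
            for i in range(n)
        ]
        for name in sources
    }


# Hypothetical example: two talkers on different continents plus venue sound.
sources = {
    "announcer": [0.5, 0.25],
    "analyst": [0.25, 0.0],
    "venue": [0.125, 0.125],
}
feeds = mix_minus(sources)
print(feeds["announcer"])  # analyst + venue only, no announcer voice
```

Building this feed at the commentator’s end means their own voice never makes the round trip to a distant console and back.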
Looking forward to Beijing, is there anything you want to change?
We have definitely proven the success of the immersive audio workflow between the editors and transmission and the recordings from the trucks, making sure all 16 channels get through. All we need now is venues with crowds so that we can immerse people in a place, whereas right now we are focusing on the athletes. But the crowd is the key to capturing the passion that comes with the sport, and that’s what we’re really looking forward to.
What else strikes you about these Games?
I think the friends-and-family efforts have been fantastic. We have cameras and watch parties in America where families can interact with their sons and daughters. It brings the Games to the athletes’ families like never before and gives you that emotion and passion that you kind of miss by not having the crowds.