The menace of time: Hitomi Broadcast checks out latency in sports broadcasting
By James Robinson, Hitomi Broadcast director.
In the days of analogue television, we never had to worry about latency. Everything was as close to instantaneous as you needed it to be. Of course, this meant keeping all signals in perfect sync for smooth cuts between pictures. Sure, it wasn’t all smooth sailing – we had our fair share of technical hurdles, like juggling subcarrier phase issues and tackling cross-colour – but we managed.
When we moved to digital, things started to change. Suddenly, frame stores were affordable, and with de-interlacing adding frames of delay, the landscape shifted. Aspect ratio conversion and graphics generation, not to mention digital video effects (DVEs), introduced more delays into the production chain.
Despite this, the systems were meticulously crafted to time up these paths, and engineers, at least in theory, had a good grasp on the timing throughout the system. When compression came into play, largely through hardware appliances and satellite links, the timing remained mostly predictable, maintaining a semblance of control in this evolving digital terrain.
Massively more complex
That world still exists to a degree inside OB trucks and facilities, although SMPTE ST2110 has added to the complexity. Don’t get me wrong, SMPTE ST2110 is a great accomplishment, and one of its major benefits is the ability to handle video, audio, metadata and timing separately. But video is a massively more complex signal than audio, so processing it takes significantly longer at each stage, leading to differential latency which grows as the signal chain extends.
There is a tendency to introduce a frame of delay at every hop through a SMPTE ST2110 system, often by resynchronising each time a signal is encapsulated. In principle, at least, SMPTE ST2110 systems are pretty much deterministic, just a bit more latent and a bit more complex to set up.
Unfortunately, principles are not always stuck to. Many SMPTE ST2110 to SDI bridges don’t fully adhere to the specs, leading to AV sync issues that are as inconsistent as they are frustrating. Often this is down to manufacturers rushing products to market without thoroughly testing detailed timing behaviours.
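To put rough numbers on the "frame of delay at every hop" point, here is a minimal sketch of how differential latency builds up along a chain. The hop names, the per-hop delays and the 50 fps frame rate are all made-up assumptions for illustration, not measurements from any real system.

```python
from dataclasses import dataclass


@dataclass
class Hop:
    name: str            # hypothetical label for a stage in the chain
    video_frames: float  # video delay added by this hop, in frames (assumed)
    audio_ms: float      # audio delay added by this hop, in milliseconds (assumed)


def av_offset_ms(hops, fps=50.0):
    """Return (video_ms, audio_ms, offset_ms) for a chain of hops.

    A positive offset means the video has been delayed more than the
    audio, so the audio arrives early unless it is delayed to match.
    """
    frame_ms = 1000.0 / fps
    video_ms = sum(h.video_frames * frame_ms for h in hops)
    audio_ms = sum(h.audio_ms for h in hops)
    return video_ms, audio_ms, video_ms - audio_ms


# Illustrative chain only: one frame per hop, two for the effects stage.
chain = [
    Hop("camera gateway", 1, 1.0),
    Hop("vision mixer", 1, 1.0),
    Hop("graphics and DVE", 2, 1.0),
    Hop("ST2110 to SDI bridge", 1, 1.0),
]

video_ms, audio_ms, offset_ms = av_offset_ms(chain)
print(f"video {video_ms:.0f} ms, audio {audio_ms:.0f} ms, offset {offset_ms:.0f} ms")
```

With these invented figures, four hops are enough to leave the audio 96 ms ahead of the picture, which is why every extra hop matters.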
Then there’s the internet, our new best friend in video transmission. It’s a budget-friendly alternative to satellite links, and COVID-19 really shone a spotlight on its cost effectiveness. Now, we can manage remote productions with ease, pulling feeds from cameras worldwide, with commentators contributing from virtually anywhere. The internet has revolutionised production, slashing the need to transport people and gear across the globe. We’re moving video instead of vans, but it comes with its own set of challenges.
Cat on a hot tin roof
But here’s the rub: the internet is about as predictable as a cat on a hot tin roof. Data is chopped up, hurled into the digital void, and we cross our fingers it comes out the other side. Timing, order, even delivery – all of it is up in the air. We have to wrestle this chaotic beast with protocols that recover lost packets, rearrange them, and try to churn out something coherent.
This leads us to the dreaded buffering. To ensure a smooth stream, the receiver’s got to have a buffer, a sort of digital waiting room, big enough to handle the data deluge. The size of this buffer depends on the network’s whims, and you guessed it – buffers mean latency, a wild, unpredictable kind. Technologies like SRT and RIST do their best to add some reliability and predictability, but they’re not there yet for millisecond precision. Other tech sacrifices quality for speed, and some even adjust their buffer sizes on the fly, leading to variable latency.
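The trade-off between buffer size and latency can be sketched in a few lines. This is a simplified model, not how SRT, RIST or any particular receiver sizes its buffer: it assumes we already have a list of measured one-way transit times (the sample values are invented) and asks how large a fixed buffer would have needed to be to absorb a given share of the packets.

```python
import math


def required_buffer_ms(transit_ms, coverage=1.0):
    """Smallest fixed buffer (ms) that would have absorbed the given share of packets.

    transit_ms: observed one-way transit times in milliseconds.
    coverage:   fraction of packets that must arrive in time (0 < coverage <= 1).
    """
    if not transit_ms:
        raise ValueError("need at least one transit time")
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    ordered = sorted(transit_ms)
    # Index of the slowest packet we still insist on waiting for.
    idx = max(0, math.ceil(coverage * len(ordered)) - 1)
    # The buffer only has to cover the spread above the fastest packet.
    return ordered[idx] - ordered[0]


# Invented sample: mostly steady, with one moderate and one severe straggler.
samples = [40, 42, 41, 45, 120, 43, 44, 41, 40, 60]

wait_for_all = required_buffer_ms(samples, coverage=1.0)
wait_for_most = required_buffer_ms(samples, coverage=0.9)
print(f"catch every packet: {wait_for_all} ms of buffer")
print(f"catch 90% of packets: {wait_for_most} ms of buffer")
```

In this sample, waiting for the single worst straggler costs 80 ms of buffer, while accepting that one packet in ten may arrive too late brings it down to 20 ms. That is the quality-for-speed bargain in miniature.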
Take remote commentary as an example. The video goes out to the commentator via the internet, with their response sent back the same way. This can add several seconds of delay. We can’t have commentary lagging behind the action, so we delay the main feed to match. But often, that delay is a moving target, leading to those awkward moments where a goal is celebrated before it’s seen on screen – a definite no-no.
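The remote commentary arithmetic can be sketched as follows. The function names and all the latency figures are hypothetical, and the model deliberately ignores the commentator's own reaction time; it only shows why a delay that was correct at line-up can be wrong an hour later.

```python
def programme_delay_ms(outbound_ms, return_ms):
    """Delay to apply to the main feed so it lines up with the returning commentary."""
    return outbound_ms + return_ms


def commentary_offset_ms(applied_delay_ms, outbound_ms, return_ms):
    """Commentary timing error against the delayed main feed.

    Positive: commentary lags the picture.
    Negative: commentary is early, so viewers hear the reaction before the action.
    """
    return (outbound_ms + return_ms) - applied_delay_ms


# Assumed figures at line-up: 900 ms out to the commentator, 700 ms back.
applied = programme_delay_ms(outbound_ms=900, return_ms=700)

# Later the links settle on smaller buffers (again, invented figures).
drifted = commentary_offset_ms(applied, outbound_ms=600, return_ms=500)

print(f"main feed delayed by {applied} ms")
print(f"commentary offset after drift: {drifted} ms")
```

Here the main feed is held back by 1600 ms, but once the round trip drops to 1100 ms the commentary runs 500 ms early, which is exactly the goal-celebrated-before-it-is-seen problem.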
The trick is understanding these latencies and their uncertainties and picking the right tech for the job. We’re in a transitional phase, and the perfect solutions are still on the drawing board.
When it comes to lip-sync and latency, knowledge is power. Systems can be tweaked, workarounds implemented, and expectations set to ensure everyone’s on the same page regarding overall latency.
In today’s world, where engineering resources are stretched thin, old-school manual syncing is out of the question. Instead, we need to rely on automated quality control tests for lip-sync and latency, run before every production, because digital and IP latencies are too variable to take on trust.
Fortunately, there’s a silver lining. We’ve got tools that can measure latency and lip-sync precisely, with minimal human intervention. This means every setup can be perfectly aligned and ready to roll when it’s showtime, ensuring top-notch quality and synchronisation.