Redefining possibilities: It is time to embrace AI, says EVS
By Olivier Barnich, EVS head of innovation and architecture.
Artificial intelligence (AI) is emerging as one of the most transformative forces in our industry, enhancing both the creativity and the quality of the content being produced. Over the past decade, EVS has played a leading role in this evolution, most recently through initiatives that use generative AI to enhance replays.
The primary goal is to free production teams from the limitations of the fixed set of video cameras used to capture an event. For instance, real-time AI processing that ‘hallucinates’ frames between existing ones (frame interpolation) can turn any camera into a high-frame-rate device, enabling smooth slow-motion replays from any angle in a production. AI can also eliminate motion blur for crisper slow-motion replays, or simulate the bokeh effect typically achieved with specialised lenses, fostering a deeper connection between the viewer and the subject. Additionally, intelligent digital zoom powered by AI not only assists replay operators with tracking and detection but also preserves image quality as the shot is reframed.
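To make the idea concrete, here is a minimal sketch of flow-based frame interpolation. It uses OpenCV’s classical Farneback optical flow rather than the trained generative models used in production, and the function name and parameters are illustrative, not EVS code:

```python
import cv2
import numpy as np

def interpolate_midframe(frame_a, frame_b):
    """Synthesise a frame halfway between two consecutive frames.

    Classical dense optical flow stands in here for the learned
    interpolation networks used in real products.
    """
    grey_a = cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY)
    grey_b = cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY)

    # Dense motion field from A to B: flow[y, x] = (dx, dy).
    flow = cv2.calcOpticalFlowFarneback(
        grey_a, grey_b, None, pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)

    h, w = grey_a.shape
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))

    # Backward-warp each source frame half a step along the motion
    # field, then blend the two estimates of the middle frame.
    mid_from_a = cv2.remap(
        frame_a,
        (grid_x - 0.5 * flow[..., 0]).astype(np.float32),
        (grid_y - 0.5 * flow[..., 1]).astype(np.float32),
        cv2.INTER_LINEAR)
    mid_from_b = cv2.remap(
        frame_b,
        (grid_x + 0.5 * flow[..., 0]).astype(np.float32),
        (grid_y + 0.5 * flow[..., 1]).astype(np.float32),
        cv2.INTER_LINEAR)
    return cv2.addWeighted(mid_from_a, 0.5, mid_from_b, 0.5, 0)
```

Learned interpolators handle occlusions and large motions far better than this classical approach, but the principle of synthesising in-between frames is the same.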
Beyond live operations, AI also contributes in many ways to near-live and post-production workflows. Firstly, all of the generative-AI filters mentioned above can be applied in post-production to refine content. Secondly, we’re seeing a surge of AI-backed features within popular post-production tools, relieving editors of tedious tasks and allowing them to focus on more creative and artistic work.
Automatic speech transcription in non-linear editors is a standout example: editors can cut an interview simply by manipulating the transcript, which in turn allows AI to generate a rough cut of the video automatically, saving editors substantial time. Speech-to-text technologies saw increased adoption in 2023, with applications such as automated subtitling and transcription in multiple languages streamlining once time-consuming tasks. Image generation from text prompts has also become a prominent feature in various tools. In near-live workflows, AI plays a pivotal role in generating metadata for content databases, improving the organisation of vast amounts of content and enabling the swift retrieval of relevant material.
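As a rough illustration of that transcript-driven workflow, the sketch below uses the open-source Whisper model to obtain time-coded transcript segments and derives a rough cut from the segments an editor keeps. The file name and the deletion rule are hypothetical:

```python
import whisper  # open-source speech-to-text (pip install openai-whisper)

model = whisper.load_model("base")
result = model.transcribe("interview.mp4")  # hypothetical source file

# Each segment carries start/end timecodes, so editing the transcript
# maps directly onto cutting the video.
for seg in result["segments"]:
    print(f"{seg['start']:7.2f}s-{seg['end']:7.2f}s {seg['text'].strip()}")

# Stand-in for the editor striking lines from the transcript: keep
# every segment except those flagged for deletion.
deleted = {2, 5}  # hypothetical segment indices removed by the editor
rough_cut = [(s["start"], s["end"])
             for i, s in enumerate(result["segments"]) if i not in deleted]
```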
A glimpse into the future
Looking ahead to 2024 and beyond, AI promises even more exciting use cases. Image generation from text prompts is expected to evolve into the generation of video and 3D assets, giving content creators more creative freedom. Significant enhancements are also on the horizon for indexing and searching content within expansive databases: finding the needle in the haystack will become possible.
Moving beyond mere metadata creation and indexation, AI’s evolving ability to understand audio and video content will empower operators to search for spoken words, find visually similar shots and even search for specific types of shots, such as a ‘close-up on a face’ or a ‘wide-angle view of a crowd’. Even more sophisticated search capabilities will follow, including facial recognition from a single example image and natural-language queries for specific video content. Imagine issuing a query like ‘find me shots of exploding cars, with the camera travelling to the left’.
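A hint of how such natural-language search can work today: joint text-image embedding models such as OpenAI’s CLIP place queries and shot thumbnails in the same vector space, so ranking shots becomes a cosine-similarity lookup. The sketch below is illustrative only, and the thumbnail file names are hypothetical:

```python
import torch
import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP.git
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Hypothetical shot thumbnails extracted from an archive.
shots = ["shot_001.jpg", "shot_002.jpg", "shot_003.jpg"]
images = torch.stack([preprocess(Image.open(p)) for p in shots]).to(device)
query = clip.tokenize(["a close-up on a face"]).to(device)

with torch.no_grad():
    img_feats = model.encode_image(images)
    txt_feats = model.encode_text(query)
    img_feats /= img_feats.norm(dim=-1, keepdim=True)
    txt_feats /= txt_feats.norm(dim=-1, keepdim=True)
    scores = (img_feats @ txt_feats.T).squeeze(1)  # cosine similarity

# Rank shots against the query, best match first.
for path, score in sorted(zip(shots, scores.tolist()), key=lambda x: -x[1]):
    print(f"{score:.3f} {path}")
```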
At live sports events, augmented reality graphics will no longer be limited to post-game analysis, as AI will enable their real-time generation without any operator intervention. Another exciting prospect lies in Neural Radiance Fields (NeRFs) for live sports. This 3D reconstruction technique renders photorealistic views of a scene from arbitrary viewpoints. While current methods are mostly limited to static scenes, ongoing research focuses on extending reconstruction and rendering to moving scenes. This opens up a wealth of creative options, such as replaying an action from a player’s point of view or, more generally, placing a virtual camera wherever the director wants it.
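For intuition: a NeRF trains a network to predict density and colour at sample points along each camera ray, and a pixel is rendered by alpha-compositing those samples. The sketch below implements only that compositing step, on placeholder arrays, with the network itself omitted:

```python
import numpy as np

def render_ray(sigmas, colours, deltas):
    """Composite one camera ray with NeRF's volume-rendering quadrature.

    sigmas:  (N,)   densities at N samples along the ray
    colours: (N, 3) RGB values at the same samples
    deltas:  (N,)   distances between consecutive samples
    """
    alphas = 1.0 - np.exp(-sigmas * deltas)  # opacity of each segment
    # Transmittance: how much light survives to reach each sample.
    trans = np.cumprod(np.concatenate(([1.0], 1.0 - alphas[:-1])))
    weights = trans * alphas
    return (weights[:, None] * colours).sum(axis=0)  # final pixel colour
```

Placing a virtual camera anywhere then amounts to casting new rays through the reconstructed scene and rendering each one this way.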
Moreover, the ability to broadcast live events as a 3D virtual world opens the door to immersive augmented and virtual reality broadcasts. This is particularly impactful for younger audiences, who crave more interactive and immersive content experiences.
Driving collective progress
Amid the rapid pace of technological advancement, the role of broadcast operators becomes pivotal. With AI technologies permeating every aspect of live broadcasting, from real-time analysis to automated content creation, operators must continuously adapt their skill sets to harness these innovations effectively, so that they remain adept at understanding, managing and optimising the diverse array of AI tools at their disposal.
Equally crucial is an awareness of the legal implications associated with the data used to train AI tools and the intellectual property rights linked to AI-generated content. This legal literacy safeguards against potential pitfalls and ensures ethical and compliant use of AI technologies.
Finally, to steer the trajectory of AI development in alignment with industry goals, the broadcast sector should actively cultivate a strong relationship with academia. A strategic approach to achieving this involves engaging and motivating researchers through the publication of datasets.
This proactive collaboration will encourage researchers to address challenges directly relevant to the broadcast industry, ensuring that advancements in AI applications align with the specific needs of our sector. Such collaboration is not just beneficial but imperative if we are to unlock the full potential of this new era of AI.