IBC 2018 Reflections: AWS’s Ben Masek on Why Machine Learning Will Revolutionize Sports-Video Production
Amazon Web Services (AWS) rolled out an army of demos at IBC this year focused on media ingest, video processing and delivery, storage, content monetization, and – most notably – machine learning. ML coursed through the veins of AWS’s entire stand this year, especially when it came to sports. ML-fueled demos ranged from live clipping for highlights to speech-to-text for media management and subtitling to facial recognition for player tracking.
SVG sat down with Ben Masek, global business development lead for media and entertainment, AWS, to get a roundup of the jam-packed demos at the company stand in Amsterdam, as well as how he sees machine learning changing the game for sports-content creators, how the cloud can turbocharge media asset management, and why some media organizations are migrating their archive to the cloud.
AWS has plenty going on at its booth this year – can you highlight some of the key demos and themes here at the show specific to sports-video production?
Specific to sports, there are two bucket areas where we’re seeing a lot of activity with our sports customers.
One has to do with a lot of the traditional video workflows, which are obviously moving towards OTT. We’re doing a lot around live capture in terms of providing a variety of tools to process your content and then make it available for live streaming or for VOD. Last year, we launched AWS Media Services, which is key to allowing even smaller- and medium-size operators to build out any of these advanced video workflows. We believe that is [driving] a democratization in which the barrier to entry for these advanced capabilities is lowered.
The other bucket, which I find very exciting, is focused on fan engagement, personalization, and monetization through machine learning. A lot of the machine-learning [tools] that we are demoing here are changing the game in terms of how our customers can get customers interested in their content.
What are some examples of these machine-learning demos at the booth?
One is centered on generating auto clips around a goal. We can now detect when a goal or a yellow card has happened and we can then trigger a variety of [options], such as going back 15 seconds and then going forward 15 seconds from when that goal happened to create a 30-second clip.
We have another demo with UEFA where we have trained the model to track the faces of all the players throughout the game. Something else we are also developing is tracking based on optical character recognition (OCR); [for instance,] in the case of an NFL player, you don’t see the face because they have a helmet. We could track the number on their jersey instead of the face.
The biggest message we’re really trying to convey to customers is that you don’t need a PhD in machine learning or data science to [launch these workflows]. We have a lot of customers who now can come in with very little media technology background and can figure this stuff out quickly. As part of these solutions, we’re even showcasing AWS CloudFormation templates that package the different services together and show the user exactly how these demos can be utilized in the real world.
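The clipping rule Masek describes (go back 15 seconds from the detected event, then forward 15 seconds, for a 30-second clip) comes down to simple window arithmetic. The sketch below is illustrative only and is not AWS's implementation: the names `ClipWindow` and `clip_window`, the parameter names, and the clamping at the start and end of the stream are all assumptions added for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClipWindow:
    """Start and end of a highlight clip, in seconds from stream start."""
    start_s: float
    end_s: float

    @property
    def duration_s(self) -> float:
        return self.end_s - self.start_s


def clip_window(
    event_s: float,
    pre_roll_s: float = 15.0,
    post_roll_s: float = 15.0,
    stream_end_s: Optional[float] = None,
) -> ClipWindow:
    """Build a clip window around a detected event (hypothetical helper).

    The window is clamped so it never starts before 0 and, when the
    stream length is known, never runs past the end of the stream.
    """
    start = max(0.0, event_s - pre_roll_s)
    end = event_s + post_roll_s
    if stream_end_s is not None:
        end = min(end, stream_end_s)
    return ClipWindow(start, end)


# A goal detected 10 minutes (600 s) into the match.
goal_clip = clip_window(600.0)
print(goal_clip.start_s, goal_clip.end_s, goal_clip.duration_s)
```

An event detected in the first few seconds of a stream produces a shorter clip here rather than a negative start time; a real workflow might instead choose to extend the post-roll to keep the clip length constant.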
AI and machine learning have been buzzy topics for a couple years now in M&E. Do you feel like these tools are now coming of age for real-world use?
I really do think so. I believe, from an evolutionary perspective, we’re beyond the technology-demonstration phase and the feasibility of what these [systems] can do is understood. Now, we’re actually working with customers to create real-life use cases. As we’re showing these different demos, sports and other media customers are realizing the value immediately.
We’re now at that stage where we can demonstrate the capability of [our machine-learning tools] and customers are asking us if we can personalize that capability for their specific needs. And, in those situations, we’ll work with them and provide professional services to figure out how to take on their specific workflow.
How are you seeing AI-powered speech-to-text services advancing and how do you believe these tools can serve sports users?
Our [speech-to-text solutions] are advancing very quickly in terms of capability. Some of the same customers that we were talking to 9-10 months ago when we first started to push these have come back in the last 2-3 months and are blown away in terms of how far we’ve come.
In terms of actual usage in the field, we now have customers who are utilizing these speech-to-text transcribe capabilities – at the very least, as a first pass for things like closed captioning and subtitle generation. These customers still have humans in the workflow to help make sure they can get to the 95% or 100% accuracy mark, but [speech-to-text] is helping out significantly on the first pass.
In terms of subtitling, now you can actually also apply translate. So, one of the other use cases is when media customers want to scale out and globalize their content to other markets, they can do that very quickly now in different languages using the transcribe and then translate mechanism to create subtitles in other languages.
Another demo that we’re beginning to showcase also allows for text-to-speech using our Amazon Polly [automated voice] service. Now, you can take the initial source coming in in English and output an actual voice in other languages. So imagine the possibilities that creates if you have rights to a sporting event and you want to [produce] it for different regions.
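The transcribe-then-translate subtitle flow described in this answer can be sketched as a small pipeline: timed transcript segments go in, each segment's text is passed through a translation step, and subtitle cues come out. The example below is a minimal sketch under stated assumptions. It uses the widely used SubRip (SRT) cue layout, the `(start, end, text)` segment shape and the function names are hypothetical, and the `translate` argument is a stand-in for whatever translation service a workflow would actually call; no AWS API is invoked here.

```python
from typing import Callable, Iterable, Tuple

# Assumed segment shape: (start_seconds, end_seconds, transcript_text).
Segment = Tuple[float, float, str]


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def build_srt(
    segments: Iterable[Segment],
    translate: Callable[[str], str] = lambda text: text,
) -> str:
    """Turn timed transcript segments into SRT cues.

    `translate` is a placeholder for a real translation call; by
    default the text is passed through unchanged.
    """
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{translate(text)}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(cues)


# Toy stand-in for a translation service (assumption for illustration).
toy_dictionary = {"Goal!": "¡Gol!", "Yellow card.": "Tarjeta amarilla."}
segments = [(12.0, 13.5, "Goal!"), (47.25, 49.0, "Yellow card.")]
print(build_srt(segments, translate=lambda t: toy_dictionary.get(t, t)))
```

Keeping the translation step as a plain callable means the timing logic can be checked on its own, and a human reviewer's corrections can be applied to the segment text before the cues are rebuilt.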
How are you seeing media-asset management evolve in the M&E sector and how are these organizations embracing cloud-based workflows?
For several years now we’ve been working with quite a few DAM and MAM vendors who now have ported their capabilities onto AWS. In some cases, they’re even rethinking their architectures to make them more cloud native in order to work directly with Amazon S3.
One of the biggest pushes in the last six months in regard to MAM goes back to ML. The big advantage of being [in the AWS Cloud] is you can actually run image and video recognition as well as transcribe the content coming in and be able to automatically extract a large amount of that metadata to populate your database. It just gives the AWS Cloud that much more of an edge. We have another demo here that’s showcasing exactly how that can be done.
We’re now seeing an enormous amount of interest from folks who have these gigantic tape libraries with a MAM in front of them. There are two reasons why they are interested in moving over to the cloud now. One is to be able to better handle their storage needs, especially with the incredible growth [in content] that they’re having to face. And then, more importantly, they can actually bring that content onto AWS and take advantage of these ML tools to start extracting more metadata. So now they can get more value out of the content because they have greater searchability, which means they have greater potential for future monetization. And then back to AWS Media Services and the democratization of these great video tools, once they have that content on S3 or even in Glacier, they can more easily tap into all these processing and packaging capabilities as well.
They also now have the ability to rapidly prototype new workflows that tie into a MAM. There are additional workflows in your MAM that might be used for distribution or other purposes beyond just the base capability of the MAM. With [AWS], anyone can now start prototyping and building out a whole variety of new media workflows without having to hire a media technologist or an army of developers. That’s exactly what we’re seeing today. Once again, we’re reducing the barrier to entry for a lot of the smaller- and medium-size shops.
While large media organizations are moving more of their archive into the cloud, most are still maintaining a hybrid cloud/on-prem model or even a multi-cloud model. How are you seeing your media customers’ use of the cloud evolve as an archiving platform?
I would say we are seeing three styles of potential uses in the cloud.
One is just a complete rethink of somebody’s MAM, where a customer comes to us and says they can’t handle the volume, flexibility, and business needs, so they want to totally rebuild their MAM as AWS-cloud native. And in those situations we will work with existing MAM partners to build that out.
Category two involves folks who are just running into a storage capacity or an infrastructure problem, so they just want to move their storage into the cloud as quickly as they can. That goes back to your hybrid solution where they actually continue to run their MAM software on-prem, at least initially, but then move their storage over [to AWS] as quickly as possible.
Type three is they have a MAM running on-prem and they just want to offload it to the cloud. They realize there’s nothing differentiating about them running on their own infrastructure, so they’re just going to offload it entirely to the cloud. They’re going to use the same vendor, same system, but they just won’t have the infrastructure [on-site].
So, I would say it’s a mix. Usually, we will begin talking to customers to understand what it is they want and many of them just want a quick fix to help with their infrastructure ASAP. But then we start showing them what’s possible [in AWS Cloud] moving toward the future, such as machine-learning capabilities for auto extraction of metadata. And in many cases, it will make sense to some of these customers to actually rethink their [operation] to be more cloud-native rather than a plain old lift and shift. That way, we can offer them a lot more runway as far as technology needs for the future.