Videogram: Giving Artificial Intelligence a sporting chance

Artificial Intelligence and machine learning are slowly creeping into sports production in areas such as automated shot selection and highlights compilation, but AI can also help with the next challenge: getting viewers to actually watch those productions.

Videogram, a small AI company founded in 2012, is backed by NTT DoCoMo, Samsung and Turner Broadcasting, and is using AI to make content easier to discover and share, whether for sports, music or Japanese anime. It is also working with recorder/monitor manufacturer Atomos on an interesting new stock footage venture that it hopes will enable creators to make money more easily.

A typical use of its AI engine is to recommend and personalise content, to encourage viewers to look beyond the obvious. For example, it has recently launched an anime site in Asia, to address the problem that while 20% of characters are very popular, the rest are not – often because they are not discoverable.

Videogram’s founder and CEO, Sandeep Casi

This is partly because the keyframe viewers see is chosen by the publisher, but there will be plenty of other possible keyframes that might appeal better to individual users, which Videogram helps to identify and personalise. “If it doesn’t entice me, I don’t click on it,” said Videogram’s founder and CEO, Sandeep Casi, who has previously worked with Industrial Light and Magic, Fuji Xerox Labs, Fujifilm, and as a consultant for the BBC and CNN.

“We are trying to solve the problem of how to cater for every individual’s interests, so provide discovery layers only for their interests.” The AI engine looks at what people like and gives them keyframes that match that.

Its technology originated a decade ago at Xerox PARC, where Casi used to work, “but there was no market for online video ten years ago.” He licensed the patents and Videogram has been training its models on different genres of content for five years, such as music videos, movie trailers, sports, food and fashion, and now with Atomos it is working on stock footage. The more data it handles, the better it gets.

Instead of presenting an online viewer with a single frame from a video, it shows a storyboard, which it creates automatically and which differs for each viewer, based on their behaviour (what they viewed before and any preferences).
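
Videogram has not published the details of its engine, but the general idea of ranking candidate keyframes against a viewer's interests can be illustrated with a minimal sketch. Everything below, the tags, the weights and the scoring function, is a hypothetical stand-in, not Videogram's model.

```python
# Illustrative sketch only: rank candidate keyframes against a viewer's
# interest profile and build a small per-viewer storyboard.
# All tags, weights and data are hypothetical, not Videogram's model.

from dataclasses import dataclass

@dataclass
class Keyframe:
    timestamp: float          # seconds into the video
    tags: set[str]            # labels detected in the frame (hypothetical)

def score(frame: Keyframe, interests: dict[str, float]) -> float:
    """Sum the viewer's interest weights for every tag present in the frame."""
    return sum(interests.get(tag, 0.0) for tag in frame.tags)

def build_storyboard(frames: list[Keyframe],
                     interests: dict[str, float],
                     size: int = 4) -> list[Keyframe]:
    """Pick the highest-scoring keyframes, then show them in timeline order."""
    best = sorted(frames, key=lambda f: score(f, interests), reverse=True)[:size]
    return sorted(best, key=lambda f: f.timestamp)

if __name__ == "__main__":
    candidates = [
        Keyframe(12.0, {"goal", "celebration"}),
        Keyframe(47.5, {"crowd"}),
        Keyframe(63.2, {"save", "goalkeeper"}),
        Keyframe(90.1, {"interview"}),
    ]
    viewer = {"goal": 1.0, "save": 0.8, "interview": 0.1}   # learned from past views
    for frame in build_storyboard(candidates, viewer, size=3):
        print(frame.timestamp, frame.tags)
```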

It then has dynamic access to the video, to allow “snacking – so you can consume the video you are most interested in,” he said, similar to the way people consume a newspaper or magazine. “Content producers expect you to watch all the way from beginning to end, which is why there is so much drop off” among online viewers. “If you give consumers what they want, there is much more engagement,” he explained. While YouTube gets an average click-through rate of 15%, Videogram is getting 45-60%, due to having multiple entry points.

It also makes it easy to share short clips from a video to social media. “It’s not cutting the video, it’s just a bookmark.” The video will then start from that frame, offering instant gratification.

Matching people to platform

The AI uses dynamic learning, taking note of an individual’s preferences. It can then compare them with other users’ preferences, so that if someone’s tastes in music or sports videos are 60% similar, it will base recommendations on that user’s favourites – with different matches for different genres, as well as for the viewer’s region and demographics.

Typically, recommendation engines match people to the platform (as with Netflix), whereas Videogram matches viewers with other users and is much more granular in its approach, so it might match someone to one set of people for rugby and to other sets when recommending cricket or skiing content.
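
As an illustration of that granularity (and not Videogram's actual algorithm), a per-genre matching step could look something like the sketch below, which uses a simple set-overlap measure and treats the 60% figure quoted above as the threshold.

```python
# Illustrative sketch of per-genre user matching: if another viewer's liked
# items overlap yours by at least 60%, borrow their favourites as
# recommendations. The similarity measure and data are assumptions, not
# Videogram's actual algorithm.

def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap between two sets of liked items, from 0.0 to 1.0."""
    return len(a & b) / len(a | b) if (a | b) else 0.0

def recommend(me: set[str], others: dict[str, set[str]],
              threshold: float = 0.6) -> set[str]:
    """Collect items liked by sufficiently similar viewers that I have not seen."""
    picks: set[str] = set()
    for likes in others.values():
        if jaccard(me, likes) >= threshold:
            picks |= likes - me
    return picks

# Separate profiles per genre, so rugby matches and cricket matches can differ.
my_rugby = {"clipA", "clipB", "clipC"}
other_rugby = {"user1": {"clipA", "clipB", "clipC", "clipD"},
               "user2": {"clipX", "clipY"}}
print(recommend(my_rugby, other_rugby))   # -> {'clipD'}
```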

It also uses real-time analysis to give constant feedback to content publishers, so they can see what is happening to their content and what is most enticing for different demographics and regions, and it recommends what they should do in their next videos if they want to reach more people.

It also matches its offering to the device it is viewed on. So, while someone on a computer screen might see a storyboard, on a phone they will get a short animated GIF cycling through the possible keyframes – and these frames change in line with trends or popularity. So, immediately after a match, you might see the goals and saves, but the next day, there might be clips of some incident in the match that is now trending, to give you a reason to have another look. “It is constantly changing itself and resurfacing itself to make sure that more and more people are clicking on it.”

If the AI knows you like a particular player, such as Messi, it will start to surface all the clips he is involved in, as it can recognise players. It can also recognise goals, fouls or anything else exciting: it spots when the decibel level goes up, tracks the frame rate to detect slow-motion replays, and can then examine the footage to see whether the goalposts are involved or a red or yellow card has been shown, he explained.
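
As a rough illustration of one of those cues, the sketch below flags one-second windows where crowd noise rises above a loudness threshold as candidate highlights. The window size and threshold are arbitrary assumptions, and a real system would combine this with the visual cues described above.

```python
# Illustrative sketch of the audio cue described above: flag moments where
# crowd noise spikes as candidate highlights. Uses only the standard library;
# the threshold and window are arbitrary assumptions, not Videogram's values.

import math

def rms_db(samples: list[float]) -> float:
    """Root-mean-square level of an audio window, in dBFS (0 dB = full scale)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples)) if samples else 0.0
    return 20 * math.log10(rms) if rms > 0 else -120.0

def candidate_highlights(audio: list[float], sample_rate: int,
                         threshold_db: float = -12.0) -> list[float]:
    """Return timestamps (seconds) of one-second windows louder than the threshold."""
    hits = []
    for start in range(0, len(audio), sample_rate):
        window = audio[start:start + sample_rate]
        if rms_db(window) > threshold_db:
            hits.append(start / sample_rate)
    return hits

# A toy signal: quiet for two seconds, then a loud one-second burst.
rate = 8000
quiet = [0.01] * (2 * rate)
roar = [0.5] * rate
print(candidate_highlights(quiet + roar, rate))   # -> [2.0]
```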

For live sports (including eSports), it indexes the feed in real time to identify clips, such as a goal, which can then be shared. This Videogram Live service is essentially a “very smart DVR”, and if you are a fan of Australian rugby (as Casi is), that would be highlighted – “so that I am consuming more of the content that I am interested in.” Everything can be shared with social media to increase traffic. The event can remain viewable afterwards, so can have a long life – depending, of course, on having the rights, and Videogram only works with rights holders.

It recently started working with cricket in India, and is in discussion with football clubs and organisations in Europe, and with eSports companies. “Anyone who has rights to a sport we can work with,” he said.

Videogram is based in Tokyo, with most of its focus on Asia, and is only now starting to work in Europe and the US. It currently gets about 100 million views per month in Asia, covering entertainment, sports and events. Cricket has been its main sport so far, but with the Rugby World Cup coming to Japan in 2019, rugby is an area in which he is seeing increasing interest.

“Sports is a very difficult market to walk into. It has multiple layers of licence holders, and it is very difficult for a small company like us to approach someone like FIFA,” he said.

Finding footage with no human intervention

It is now working with Atomos on a stock footage engine, using Atomos data and a selection of users to train it, although stock footage is less complex than sports or music, as there tend to be fewer elements involved.

For example, if you need footage of a red car, it will find all the places a red car appears across its video library. “For anyone using stock footage today, it is often difficult to find what they want,” but this can find it, without any human intervention. A search for “mountains and river” rapidly found several clips of rivers running through wooded and mountainous areas.
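
As an illustration of how such a search could work once clips have been labelled, the sketch below builds a simple inverted index from detected labels to clips. The labels here are typed in by hand, whereas in practice they would come from the visual analysis described above; none of this is Videogram's implementation.

```python
# Illustrative sketch of tag-based footage search: an inverted index from
# detected labels to clips, so a query such as "red car" or
# "mountains" + "river" returns every matching clip. The labels are manual
# stand-ins for the output of an object detector.

from collections import defaultdict

def build_index(clips: dict[str, set[str]]) -> dict[str, set[str]]:
    """Map each label to the set of clip IDs it appears in."""
    index: dict[str, set[str]] = defaultdict(set)
    for clip_id, labels in clips.items():
        for label in labels:
            index[label].add(clip_id)
    return index

def search(index: dict[str, set[str]], *labels: str) -> set[str]:
    """Return clips that contain every requested label."""
    sets = [index.get(label, set()) for label in labels]
    return set.intersection(*sets) if sets else set()

library = {
    "clip_001": {"red car", "street"},
    "clip_002": {"mountains", "river", "forest"},
    "clip_003": {"river", "city"},
}
index = build_index(library)
print(search(index, "mountains", "river"))   # -> {'clip_002'}
```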

Besides visual analysis, it can also do speech-to-text analysis. “We can get very accurate data of what the subject is saying,” he said. It can also reference closed captions to check the context, but it does not rely on them, as he has found closed captions to be unreliable.
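
One simple way to cross-check a transcript against captions, purely as an illustration and not Videogram's method, is a word-overlap score that flags captions that disagree with the recognised speech.

```python
# Illustrative sketch of cross-checking a speech-to-text transcript against a
# clip's closed captions: a rough word-overlap score flags captions that look
# unreliable. The text is typed in here; in practice it would come from a
# speech recogniser and the caption file.

from difflib import SequenceMatcher

def agreement(transcript: str, captions: str) -> float:
    """Similarity between the recognised speech and the captions, 0.0 to 1.0."""
    return SequenceMatcher(None, transcript.lower().split(),
                           captions.lower().split()).ratio()

transcript = "and that is a superb try in the corner"
captions = "and that is a superb try in the corner by the winger"
print(round(agreement(transcript, captions), 2))   # close to 1.0 means the captions agree
```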

He hopes that broadcasters and other content producers will use the system to enable them to unlock unused assets they have that would otherwise need expensive and lengthy human intervention to catalogue. With Atomos, assets can be uploaded and analysed by its AI and rapidly made available to market, with no need for humans to annotate them.

Another problem with a lot of content, including stock footage, is what happens once it is in the hands of users, and how to track digital rights: it could be sent on to another person, and viewed or used by them without paying. Videogram will use blockchain so that if content is shared it has to be authenticated and the content creators paid.
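
Videogram has not said how its blockchain layer will work; the sketch below only illustrates the general idea of a hash-linked record of share events, so that the chain of authorised recipients can be verified and not quietly altered.

```python
# Minimal, generic sketch of hash-linked share records: each share of a clip
# is appended as a record linked to the hash of the previous record, so the
# list of authorised viewers can be verified. Not Videogram's implementation.

import hashlib
import json
import time

def add_share(chain: list[dict], clip_id: str, recipient: str) -> list[dict]:
    """Append a share record linked to the hash of the previous record."""
    prev_hash = chain[-1]["hash"] if chain else "genesis"
    record = {"clip": clip_id, "to": recipient,
              "time": time.time(), "prev": prev_hash}
    record["hash"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()).hexdigest()
    return chain + [record]

def verify(chain: list[dict]) -> bool:
    """Recompute every hash and check each record points at its predecessor."""
    prev = "genesis"
    for record in chain:
        body = {k: v for k, v in record.items() if k != "hash"}
        if record["prev"] != prev:
            return False
        if hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest() != record["hash"]:
            return False
        prev = record["hash"]
    return True

chain: list[dict] = []
chain = add_share(chain, "clip_002", "alice")
chain = add_share(chain, "clip_002", "bob")
print(verify(chain))   # True; edit any record and this becomes False
```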

This can also allow for pay-per-view stock footage, creating individual or personalised channels based on keywords, so that someone who wants to watch snow scenes or landscapes set to music could view them on subscription (or ad-supported) – these sorts of channels are surprisingly popular on YouTube. “We are trying to make more money for the content creators,” he said. “We want assets to get monetised where they are not getting monetised today.”

If there is a lot of demand for something, Videogram can share that data with creators, so they can shoot more.

“We know which stock footage is popular and is being bought,” he said. It also makes it easy for buyers and sellers to communicate, via private messages or public comments, and has clearly been influenced by social media. “We are trying to democratise content creation,” he explained.

It has finalised the first version of its system with Atomos, which is currently in a closed beta test with influencers. Casi hopes that they will be able to launch it officially at CES in Las Vegas next January.

The main thing still to be settled is how Videogram and Atomos make money. It might be a free service for creators, or just for Atomos users (with a subscription for others), or they might take a small fee or percentage. The creators will then be able to set their own prices, which can differ by region or other parameters.

Some of the possibilities being discussed include ways of making use of lens and colour data, which is not usually searchable today on any site. He wants to enable someone to easily find footage that matches their looks (or look-up tables, LUTs). To do this, users could upload the clips they want to match; the AI would analyse them, look for the same characteristics, and then create multiple edits with different suggested stock footage inserted, so buyers could easily compare them.
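
As an illustration of matching footage by colour character, and a much simplified stand-in for real lens and grade matching, the sketch below compares coarse RGB histograms of a reference clip and candidate clips; the pixel data and bin sizes are arbitrary.

```python
# Illustrative sketch of matching footage by colour character: compare coarse
# RGB histograms of a reference clip and candidate stock clips, and rank the
# candidates by similarity. A real system would work on graded frames and
# richer features; everything here is a simplified assumption.

def histogram(pixels: list[tuple[int, int, int]], bins: int = 4) -> list[float]:
    """Coarse, normalised RGB histogram (bins per channel, joint binning)."""
    counts = [0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in pixels:
        counts[(r // step) * bins * bins + (g // step) * bins + (b // step)] += 1
    total = len(pixels) or 1
    return [c / total for c in counts]

def similarity(h1: list[float], h2: list[float]) -> float:
    """Histogram intersection: 1.0 means identical colour distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2))

reference = [(200, 60, 40)] * 80 + [(30, 30, 30)] * 20      # warm, red-heavy look
candidate_a = [(220, 50, 45)] * 75 + [(20, 20, 20)] * 25    # similar grade
candidate_b = [(40, 90, 200)] * 100                         # cool, blue-heavy look
ref_h = histogram(reference)
for name, clip in [("a", candidate_a), ("b", candidate_b)]:
    # candidate a should score much higher than candidate b
    print(name, round(similarity(ref_h, histogram(clip)), 2))
```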

In future, Atomos users may be able to search for stock footage from their recorder/monitor (which is, after all, a small computer), which might save them going to another location for a shot they can easily pick up online, or mean they don’t have to come back another day if they are faced with the wrong weather.
