Do VARs dream of electric football commentators?

It's not often that Martin Tyler and the MIT Sloan Sports Analytics Conference are mentioned in the same breath, so savour this newsletter while it lasts.

At this year's Sloan, two folks from Amazon and one from Fox Sports got together to publish a paper, titled 'Sports narrative enhancement with natural language generation'. Their aim was to turn bits of sports data into sentences that 1) made sense and 2) read well.

While it's fairly straightforward to put data into templated sentences, it's much harder to add in variations and keep it all sounding human and un-robotic. At the same time, you don't want to lose the meaning or, worse, accidentally change the meaning of what's being said.
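
To make that trade-off concrete, here is a minimal sketch of the 'templated sentences with variations' idea. It is not the method from the Sloan paper: the event fields, the templates, and the `render` helper are all made up for illustration. The one real design point is the check at the end, which refuses any sentence that has dropped a fact from the data.

```python
import random

# Hypothetical event record; the field names are illustrative assumptions,
# not taken from the Sloan paper or any real data feed.
EVENT = {"scorer": "Agüero", "team": "Manchester City", "minute": 94}

# Hand-written variations that all carry the same three facts.
TEMPLATES = [
    "{scorer} scores for {team} in minute {minute}.",
    "Goal for {team}! {scorer} finds the net, {minute} minutes played.",
    "{minute} minutes gone and {scorer} puts {team} on the scoresheet.",
]


def render(event, rng):
    """Pick a template at random and fill it with the event's facts."""
    sentence = rng.choice(TEMPLATES).format(**event)
    # Cheap guard against losing the meaning: every fact must appear.
    for value in event.values():
        if str(value) not in sentence:
            raise ValueError(f"sentence dropped a fact: {value!r}")
    return sentence


if __name__ == "__main__":
    rng = random.Random()
    for _ in range(3):
        print(render(EVENT, rng))
```

Three templates is exactly the problem, of course: it varies, but not for long, and someone has to write every one by hand.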

It helped them — Henry Wang, Saman Sarraf, and Arbi Tamrazian — that sport is fairly formulaic. There are set periods of play, set numbers of players, on-field officials to make instantaneous decisions. As much as we like to think of sport as being played on fields of possibilities, there's quite a tight limit on what can actually happen. Nobody's discovering penicillin at the Camp Nou.

This, after all, is why the sporting video games we all know and love are able to have in-game commentary that more or less works for anything you do. Enter Martin Tyler.

"I'm reacting to prompts from the game-makers," he explained to VICE in 2015, talking about recording for the FIFA series. "They'll say, 'It's 3-1, and they've just scored, and there's five minutes to go – what would you say?' So, I go, 'Ooh, five minutes to go and they've just scored? Maybe they've got a chance now.' And then they'll ask me for another one [...] And then they'll ask for another one. [...] And then they'll ask for another one, and so it goes."

When you're EA Sports, with the money that the FIFA franchise has to burn, you can afford to get someone to spend a few days recording all these variations, which you can then pick and play at random when the time comes in the game. But for anyone else this would quickly become unfeasible and unmanageable, particularly if you're trying to create templates for data to slide into. It's not flashy, but a big part of the Sloan paper's value is the scalability.

Now, if we're already used to 'automated' human commentary in video games, and we have techniques to create natural-sounding sentences from data, in a sport that has so much of it... how far away are we from this all being combined in real life? How much of a stretch would it be to have properly automated TV football commentary?

Automated reporting (or 'AI journalism' if we want to use the proper buzzwords) is already firmly in use in written media. The scope is still fairly limited, but, for example, the BBC have been using a partly automated system to report on NHS data since 2019.

Sports results were actually one of the first widespread uses of these techniques too, with the Associated Press in the game since 2016. It's for exactly the reasons already mentioned: regular events, ready-made data, largely formulaic outcomes.

The article linked above has an extract of a machine-written AP story, which gives a sense of what the Amazon and Fox Sports engineers were trying to improve on:

STATE COLLEGE, Pa. (AP) — Dylan Tice was hit by a pitch with the bases loaded with one out in the 11th inning, giving the State College Spikes a 9-8 victory over the Brooklyn Cyclones on Wednesday.
Danny Hudzina scored the game-winning run after he reached base on a sacrifice hit, advanced to second on a sacrifice bunt and then went to third on an out.

It's to sportswriting what Huel is to dining experiences: it covers the essentials and, granted, is technologically impressive, but you'd feel cheap gifting it to your mother.

However, if software engineers like the ones who wrote the Sloan paper can make this sound a lot more human, the words could be run through an artificial voice generator (these are getting more lifelike themselves) and put straight into a broadcast.

What this ends up looking like (or, I guess, sounding like) could depend on commentary style too. Some approaches are largely descriptive: player names, general locations on the pitch, basic descriptions of shots or notable passes. With a small enough delay between the action happening on the pitch and the data being processed*, this could easily be automated.

*(note: this is probably a fairly big assumption, although data companies will know, and boast about, their latency figures better than me; even if it's not possible now, though, if there's the potential for money to be made, someone could always set up near-instant 'commentary data' specifically).

On top of that, you could even put together a separate language generation set-up with a shouty, enthusiastic voice synthesiser aimed specifically at big moments. When goals are remarkable it's mostly because they're from long range, after a long passing move, or came at a dramatic point of the match. All of these traits are easily identifiable in data.
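
As a rough sketch of what 'easily identifiable in data' might mean, the snippet below flags a goal against those three traits. Everything specific in it is an assumption for illustration: the `Goal` fields, the `big_moment_reasons` name, and especially the thresholds (25 metres, 10 passes, the 85th minute), which a real system would tune against actual matches.

```python
from dataclasses import dataclass


@dataclass
class Goal:
    # Hypothetical fields; a real data feed will name and shape these differently.
    distance_m: float    # shot distance from goal, in metres
    passes_in_move: int  # passes in the build-up to the goal
    minute: int          # match minute the goal was scored in
    margin_after: int    # scoring team's lead after the goal (0 = level)


def big_moment_reasons(goal):
    """Return the reasons, if any, to switch on the shouty voice."""
    reasons = []
    if goal.distance_m >= 25:          # assumed threshold
        reasons.append("long range")
    if goal.passes_in_move >= 10:      # assumed threshold
        reasons.append("long passing move")
    # A late equaliser or a late go-ahead goal counts as drama.
    if goal.minute >= 85 and goal.margin_after in (0, 1):
        reasons.append("late drama")
    return reasons


if __name__ == "__main__":
    # Illustrative numbers for a close-range, stoppage-time winner.
    late_winner = Goal(distance_m=10, passes_in_move=3, minute=94, margin_after=1)
    print(big_moment_reasons(late_winner))
```

An empty list means the regular voice carries on; anything else is a case for Big Moment mode. Note that 'a league title depends on it' is not in here, and that is precisely the sort of context that is harder to get out of event data.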

Imagine the scene. You've finished a round of bug fixes on TylerBot v3.1.6 and are setting it loose on a match. "Manchester City on the edge of the box. Attempted pass cut out. Touch from Balotelli, out to Sergio [TylerBot identifies a shot has just been taken] Agüer- [TylerBot identifies that a league title-winning goal has been scored; Big Moment mode is activated] -OOOOO!!! I swear you'll never see anything like this ever again!"

Or something to that effect.

Martin Tyler, the real Martin Tyler, was replaced as the voice of FIFA video games in 2020 (although the replacement, this time around, was another human: Derek Rae). In truth, even if the FIFA games decide to jettison the carbon-based commentators, the real sport of football is random and unexpected and instant enough that humans are going to remain the preferable option. Moments like the Agüero goal, or Eric Cantona's kick into the crowd at Crystal Palace, or, for very different reasons, Christian Eriksen's cardiac arrest need a real person to describe them appropriately.

However, there's an increasing amount of sport being broadcast on an increasing number of channels, and companies will always try and do things on a budget. If a small league wanted to air every single one of its matches somewhere on the cheap, without paying a full slate of commentators every week, could they use automated commentary for some of the games that won't have as large a focus on them?

Another idea (which will feel a little less scary for existing or aspiring freelance commentators — sorry folks) would be to pair a human commentator with an automated co-comm.

If the human on-mic has a more traditional play-by-play style, the AI could be set up as an insights and fun fact-based colour commentator. Or, if the real person is more interested in relaying atmosphere or their own analysis, maybe the AI could be the play-by-play and could be turned off/muted when the human wants to chip in.

This, rather than replacing the Martin Tylers of the world outright, seems like the most likely use of automated commentary, if it's ever going to happen. Perhaps it's something that some of the mid-sized leagues across the world, vying against each other for global attention, will look into.

And it might be closer than we imagine. IBM demoed a prototype system back in 2019, although the SportTechie write-up reports that it said the same phrase, "Here comes the cross!", four times.
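
For what it's worth, that particular kind of repetition is one of the cheaper things to guard against. The sketch below is purely illustrative and has nothing to do with how IBM's system works: `PhrasePicker` is a made-up helper that remembers its last few choices and will not pick any of them again until they have dropped out of memory.

```python
import random
from collections import deque


class PhrasePicker:
    """Pick phrases at random, never repeating any of the last `memory` picks."""

    def __init__(self, phrases, memory=2, seed=None):
        self.phrases = list(phrases)
        if memory >= len(self.phrases):
            raise ValueError("memory must be smaller than the number of phrases")
        self.recent = deque(maxlen=memory)  # oldest pick falls off automatically
        self.rng = random.Random(seed)

    def pick(self):
        options = [p for p in self.phrases if p not in self.recent]
        choice = self.rng.choice(options)
        self.recent.append(choice)
        return choice


if __name__ == "__main__":
    # Hypothetical cross-related phrases, written for this example.
    picker = PhrasePicker([
        "Here comes the cross!",
        "Whipped in from the right.",
        "Ball into the box.",
        "Delivery towards the far post.",
    ])
    for _ in range(6):
        print(picker.pick())
```

It still only has as many lines as someone wrote for it, which is the scalability problem all over again.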

With time, the systems will be able to get better. Hey, Siri, the fans are spilling out onto the pitch; is it all over?