Spotify for 'successful line breaks'

There are a lot of disagreements about data, but the one thing that most companies seem to agree on is that you can never have too much of it.

You could say that we're living in an 'age of abundance' of football data - but only if you wanted to look horribly out-of-date in three years. There will undoubtedly be more and more datapoints added to our data packages as time goes by, which makes it all the more shocking when you see the number of metrics that some companies currently advertise themselves as having. Hundreds? Ha, smol babies: try the four figures.

Having so many to choose from is, in some ways, very nice. Metrics can be much more specific on a range of different criteria: phase of play, action type, area of pitch, relation to opposition structure. But abundance introduces its own problems: in this case, discoverability.

Think about music (or podcasts or TV). So many people are creating so much of that stuff that it can be hard to find things you like. (Or, if you're a creator, hard to put your work in front of an audience who might like it.) Take that situation and abstract it out slightly, and you have a case of 'how do I find the things I want for my current circumstance?'

Spotify, YouTube, Netflix, and their various competitors all do this with algorithms. But that isn't the only way, nor the original way, of doing the job. In the olden days, there were these things called radio stations and TV channels, which chose what to put in front of their audience. This is similar to the various bits of default software that companies - including the data providers - offer to their customers.

Now, you could listen to stuff that wasn't on the radio, but let's be honest, most people mainly just listened to what was on the radio. That might have been because they were un-curious, it might have been because the radio music was what they were familiar with, or it might have been because the radio station's choices were genuinely really good.

You could extend this metaphor even further, but I'll spare you that. Manual curation naturally limits what is put in front of an individual, relies on smart experts doing the curating, and tends to keep people in the status quo. So are algorithms the way? Is that how people will make sure that the right dozen metrics, out of a pool of thousands, get put in front of people? Is that the answer to the coming (or present) football data discovery problem?

There are, unsurprisingly, problems with this.

The first is that it isn't exactly an established best practice for data providers to create good-quality metadata about their statistics offerings. The second is that building an algorithm on usage data relies on a large number of users. And the third is that giving someone a 'bad' recommendation in a professional football environment matters a lot more than if Netflix pushes a dud TV show in front of you.
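To make the metadata point a bit more concrete, here's a minimal sketch in Python of what tagged metrics could look like. Everything in it is made up for illustration: the `Metric` fields (borrowed from the criteria mentioned earlier), the four example metrics, and the `find_metrics` helper. It isn't any provider's actual schema, just the simplest thing that would let you filter a catalogue rather than scroll through it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    # Hypothetical metadata fields, not a real provider schema.
    name: str
    phase_of_play: str
    action_type: str
    pitch_area: str


# Invented example catalogue; a real one would run to the thousands.
CATALOGUE = [
    Metric("Progressive passes in build-up", "build-up", "pass", "defensive third"),
    Metric("Line-breaking passes", "progression", "pass", "middle third"),
    Metric("Shots from cut-backs", "chance creation", "shot", "final third"),
    Metric("Crosses into the box", "chance creation", "pass", "final third"),
]


def find_metrics(catalogue, **wanted):
    """Return the metrics whose metadata matches every requested field."""
    return [
        metric
        for metric in catalogue
        if all(getattr(metric, field) == value for field, value in wanted.items())
    ]


# Usage: narrow the catalogue to passing metrics in the chance-creation phase.
matches = find_metrics(CATALOGUE, phase_of_play="chance creation", action_type="pass")
print([metric.name for metric in matches])
```

Even this toy version only works if someone has filled in those fields consistently for every metric, which is exactly the bit that isn't standard practice.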

You also get into a question about what the algorithm - or, more likely, algorithms plural - are meant to do. Think of music streaming, which does this more and better than Netflix or YouTube. Music streamers are well aware that people have different moods and situations, in which they'll want to listen to different things. But do you want to listen to 'Rock' or do you want to listen to 'Energy' (or 'dirty rock happy indie thursday evening')?

So, if you're logging onto your SoccerPlatform2027, will you want to look at 'Attack' or 'Final Phase Chance Creation' or 'sexy vibe xT aura farming'? And should the algorithm tilt towards 'things the team did well this weekend', 'things they could improve', or 'things most correlated to success'? (And how much would you trust the company producing the algorithm to accurately surface each of those things?)

At a guess, an algo (or several algi) would be useful, but would need to be smartly limited in scope and be an addition to curated features rather than a replacement for them. That's all for now, otherwise I'll go off on a tangent about the difficulties in dealing with the many slightly different terminologies and conceptual approaches that football practitioners have, like 17th-century variants of Protestantism. (Honestly, you give an inch on translating juego de posición into the common tongue...)