As Arsenal maintain a five-point lead in the Premier League title race, why do prediction models differ so widely?

Arsenal fans, how confident are you feeling about the Premier League?

Here is the state of play: Mikel Arteta's team have a five-point lead over Manchester City, and each team has 12 games left to play. Both still have games against Liverpool, Chelsea, and a strong-performing Brighton to come, as well as a huge clash against each other. Both are still in their European competitions.

We'll all have our own gut feelings about how things might go, but maybe you'd want to check some kind of statistical model to see if it matches up with what you think. You are, after all, the kind of person to read an analytics newsletter. So you go to US pollster celebrities FiveThirtyEight, who at time of writing give Arsenal a 54% chance and put City on 44%. Then, for a second opinion, you check Opta. They put things at 50%-49%. Hmm. And for a third opinion, maybe the Euro Club Index: 51%-47%.

Note: it doesn't look like any of these three had updated since the Liverpool-Manchester United match at time of writing on Sunday evening, a result which will have reduced United's slim title chances and marginally increased Arsenal and City's.

One of these models makes the contest look like it's on a knife-edge. One gives Arsenal a bit of breathing space. The other is Goldilocks-like right in the middle. What gives?

There are, as far as I can tell, two basic parts of this modelling: determining how strong each team is, and simulating what happens when two teams of strength X and strength Y play each other. Differences can come in both parts.
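To make the second part concrete, here is a minimal sketch of how a simulation turns per-match chances into a title probability. It is not any of the three providers' actual method. The win/draw/loss probabilities are made-up illustrative numbers, the function name `simulate_title_race` is my own, and the sketch ignores the head-to-head game and every other team in the league.

```python
import random


def simulate_title_race(lead, games_left, leader_probs, chaser_probs,
                        n_sims=20_000, seed=0):
    """Monte Carlo estimate of who finishes on more points.

    leader_probs / chaser_probs are hypothetical (win, draw, loss)
    probabilities applied to every remaining match of that team.
    Returns (leader ahead, chaser ahead, level on points) as fractions.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    points = (3, 1, 0)         # points for a win, a draw, a loss
    leader_ahead = chaser_ahead = level = 0

    for _ in range(n_sims):
        # Draw an outcome for each remaining match and add up the points.
        leader_pts = lead + sum(rng.choices(points, weights=leader_probs, k=games_left))
        chaser_pts = sum(rng.choices(points, weights=chaser_probs, k=games_left))

        if leader_pts > chaser_pts:
            leader_ahead += 1
        elif chaser_pts > leader_pts:
            chaser_ahead += 1
        else:
            level += 1  # would be settled on goal difference, not modelled here

    return leader_ahead / n_sims, chaser_ahead / n_sims, level / n_sims


# Assumed inputs: a five-point lead, 12 games left, and a slightly stronger chaser.
leader, chaser, tied = simulate_title_race(
    lead=5, games_left=12,
    leader_probs=(0.60, 0.20, 0.20),
    chaser_probs=(0.70, 0.18, 0.12),
)
print(f"leader {leader:.1%}, chaser {chaser:.1%}, level {tied:.1%}")
```

Nudge the assumed probabilities a little and the headline percentage moves a lot, which is one reason models with different team ratings can land in quite different places.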

All three of the modellers mentioned give Manchester City a stronger rating than Arsenal. I think this basically matches what most fans think too: City have the better squad and greater depth, and while Arsenal could drop points, they might not have enough games left to drop five more than City do. The way that they rate the teams is different, though, and this looks like it could explain part of the difference in the predictions.

FiveThirtyEight's system (which gives the 54%-44% race) starts each season afresh. They base a team's rating two-thirds on what the team's rating was at the end of the previous season, and one-third on their Transfermarkt squad valuation.
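As a rough sketch of that preseason blend: the two-thirds and one-third weights come from the description above, but how a squad valuation gets converted onto the rating scale is not something I can see spelled out, so `market_value_rating` below is a hypothetical, already-converted number, and the inputs are invented.

```python
def preseason_rating(prev_rating: float, market_value_rating: float) -> float:
    """Blend last season's closing rating with a squad-value-based rating.

    Weights follow the two-thirds / one-third split described in the text.
    market_value_rating is an assumed stand-in for a Transfermarkt valuation
    that has already been mapped onto the same scale as the ratings.
    """
    return (2 / 3) * prev_rating + (1 / 3) * market_value_rating


# Invented example: a team that finished on 90.0 but has a squad "worth" 84.0.
print(round(preseason_rating(90.0, 84.0), 1))
```

The point of the sketch is that an expensively improved squad can lift a team's starting rating before a ball has been kicked, which a purely results-driven system would not do.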

On the other hand, neither Opta (50%-49%) nor Euro Club Index (51%-47%) seems to mark the end and beginning of seasons in such a way. The ECI's methodology notes that more recent results get a higher weighting in the team ratings, but things work on a more continuous basis for the two of them.

I mention this because FiveThirtyEight's system gives Arsenal the second-best rating in the Premier League, whereas both Opta and the Euro Club Index put Arsenal fourth-best. Maybe there's something else in the modelling going on, but given that the Gunners' last two seasons saw them finish eighth and fifth, it seems possible that FiveThirtyEight's preseason resets have helped Arsenal climb the rankings quicker than in the other two rating systems.

(This also means that if you wanted to game FiveThirtyEight's system then hacking into Transfermarkt would probably do it. Why you'd want to do that, I don't know. Maybe if you wanted your models to look better than the season prediction that everyone on my Twitter timeline uses.)

On top of this, each system updates its ratings slightly differently after every game played, with FiveThirtyEight again standing apart from the other two.

They all share a similarity: teams get a boost if they perform better than expected in a game, and lose points if they lose a game they should win. However, FiveThirtyEight's ratings use expected goals as a feature while Opta and Euro Club Index's systems are (according to their methodology pages) purely results-based. This means that a team could win a game but lose points in FiveThirtyEight's system, if their underlying performance was well below expectation, whereas they'd still gain points in the other two. Maybe this also helped Arsenal whizz up the ratings, if their underlying numbers were even more impressive than their results have been.
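Here is a toy version of that difference, using an Elo-style update (rating moves in proportion to observed minus expected). This is a generic technique I've swapped in for illustration, not the formula any of the three actually publish: the function names, the `k` step size, the 50/50 `xg_weight`, and all the match numbers are assumptions.

```python
def result_score(goals_for: int, goals_against: int) -> float:
    """Score a match on results alone: 1 for a win, 0.5 for a draw, 0 for a loss."""
    if goals_for > goals_against:
        return 1.0
    if goals_for == goals_against:
        return 0.5
    return 0.0


def xg_blended_score(goals_for, goals_against, xg_for, xg_against, xg_weight=0.5):
    """Hypothetical performance score mixing the result with the share of xG."""
    total_xg = xg_for + xg_against
    if total_xg == 0:
        # No chances recorded at all: fall back to the result alone.
        return result_score(goals_for, goals_against)
    xg_share = xg_for / total_xg
    return (1 - xg_weight) * result_score(goals_for, goals_against) + xg_weight * xg_share


def updated_rating(rating, expected_score, observed_score, k=4.0):
    """Elo-style step: move the rating by k times (observed - expected)."""
    return rating + k * (observed_score - expected_score)


# Invented match: a team rated 80.0 was expected to score 0.7,
# wins 1-0, but loses the xG battle 0.4 to 1.6.
results_only = updated_rating(80.0, 0.7, result_score(1, 0))
with_xg = updated_rating(80.0, 0.7, xg_blended_score(1, 0, 0.4, 1.6))
print(round(results_only, 1), round(with_xg, 1))
```

Under these assumptions the same 1-0 win nudges the results-only rating up and the xG-aware rating down, which is the behaviour described above.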

Arsenal fans, please put two fingers to the side of your necks. How are your pulses? Better or worse so far? Let's give City fans something to worry about.

For the third time, FiveThirtyEight has a distinctive factor, and this one I can say for sure I'm a fan of. On the webpage, you can hop back to previous points in time and previous seasons with very handy dropdown menus. Last season, Manchester City opened the year with a FiveThirtyEight rating of 92.1 and ended it on 93.5. This season, they opened with a rating of 92.3 and currently sit on 90.6. Number gone down.

This doesn't necessarily mean that FiveThirtyEight is saying 'City are worse than they were at the start of the year'. It could mean that it's taken a while for their rating to accurately reflect City's quality. Either way, number gone down. Which means that, even though City's rating is still higher than Arsenal's rating, the gap has been getting smaller and smaller all campaign, and it could get smaller still.

The thing with these predictions is that, like the table itself, they're going to change with each passing gameweek. On the evening of the first of April, things could feel very different. Arsenal will have played Fulham, Crystal Palace, and Leeds; City will have played Crystal Palace and Liverpool. They won't have played a third team in that timespan because of their FA Cup quarter-final on 18 March, a factor that I doubt any of these predictions will be factoring in: other priorities.

Arsenal only have two competitions to think about, the league and the Europa League, and it's pretty clear which of the two they would care most about. City, for now at least, have three, and if they had to pick only one, I bet they'd rather win the Champions League. The ratings that each of these models assign to each team are singular: as far as I can tell from the methodology pages, they don't try to factor in potential squad rotation, which might result in lower-quality teams playing matches (to be fair, this would be a very difficult task to do).

What we see with these models is Arsenal being given an advantage, ranging from one percentage point to ten. But the fact that City still have other distractions means that I, personally, would lean towards the more Arsenal-favourable side of that range.
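For anyone who wants to check that range, the arithmetic is just Arsenal's percentage minus City's for each of the three forecasts quoted earlier. The variable names are mine.

```python
# (Arsenal %, City %) as quoted at the time of writing.
forecasts = {
    "FiveThirtyEight": (54, 44),
    "Opta": (50, 49),
    "Euro Club Index": (51, 47),
}

# Arsenal's edge over City in percentage points, per model.
gaps = {model: arsenal - city for model, (arsenal, city) in forecasts.items()}

smallest = min(gaps.values())
largest = max(gaps.values())
print(gaps, smallest, largest)
```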

How confident should Arsenal fans feel about the Premier League? Cautiously. Depending on which model you ask.

