Written by Matt Stevenson
We have all probably held our heads in our hands as an AFC Bournemouth player misses what appears to be a ‘sitter’. The flip side is when our player smashes the ball into the net, making a difficult chance look routine. Historically, there was little data that could be used objectively to determine (1) whether a player was more likely to be ice-cool or go to bits in the box and (2) whether a forward who scored 20 goals for the team winning the championship was more clinical than a forward who scored 10 for a team fighting relegation. However, in the last five or so years, expected goals (or xG) has become widespread. xG tries to measure how often an ‘average’ player would score from a certain position against an ‘average’ goalkeeper, which might help in answering the two questions above.
An explainer video is available on YouTube (‘What is Expected Goals? Use xG to Increase Profits in Football Betting’), but a quick summary is that, based on a large database from professional leagues, a prediction is made of the likelihood of a goal, from 1% (next to no chance) to 99% (you’d almost have to be trying to miss). This prediction is based on: how far away the shot is; how many bodies are in the way; whether it was a set piece; whether it was a shot or a header; and whether it was with the favoured foot. No statistical model is perfect, but xG may be a good way of determining whether a striker is above or below average when it comes to finishing (outscore the xG by a margin and you are clinical, fall below the xG by a margin and you are wasteful). It can also be used to judge how good a goalkeeper you have, using similar logic to compare actual goals conceded with xG conceded.
xG timelines have been posted on the UpTheCherries Forum to give an additional view of how games have panned out, and have been met with interest or criticism from those who express a view. One of the criticisms relates to the concept of the ‘average’ player, as Lewandowski would surely convert more identical chances than the journeyman striker who scores once a month. This seems logical, so I looked to see how much variability there was between the xG rated for each striker and their actual goal return for the top 20 scorers in the Championship last season (as listed on https://www.infogol.net/en/leagues/english-football-league-championship-top-goalscorers-2020-21/264). For these 20 players, goal returns ranged from 33 (Toney) to 9 (Dike). For interest, I separately looked at AFCB players, and selected other players, in the full list of 80 players, and also looked at how some are doing this half-season.
For my analysis, I plotted a linear regression through the data points but forced it through the origin, as you cannot score without having an xG. The analysis showed that zero was within the confidence interval for the intercept, so I’m happy with this simplification. The fit to the data is pretty good (see Figure 1), with an adjusted R-squared of 0.91, meaning that 91% of the variation in goals scored can be explained by the xG statistic. For real-world data this is very high. The line on the graph shows the best prediction line, with those above the line beating their xG and those below the line falling short of it. We can see that former AFC Bournemouth attacker Arnaut Danjuma did very well, which will come as no surprise to Cherries fans who watched him during the previous season, whereas Collins at Cardiff underperformed. Two other strikers with AFCB connections were close to their xGs, with Jamal Lowe bettering his, whereas Dominic Solanke’s goal output was slightly below his. The best-fitting relationship was that 1 xG should result in 1.06 goals, but there is likely to be selection bias from picking the top 20 scorers, as people such as Andre Gray (5 goals from an xG of 9.56) aren’t included, so I’m happy using a 1:1 relationship for further analyses.
Figure 1: Goals scored vs xG
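For anyone wanting to try the through-the-origin fit themselves, here is a minimal sketch in Python. It is not the code behind Figure 1: the top-20 data set is not reproduced in this article, so the sketch uses the current-season figures quoted further down purely as illustrative input, and the names `players` and `fit_through_origin` are my own. The slope it prints will therefore not match the 1.06 reported above.

```python
# Illustrative (xG, goals) pairs: current-season figures quoted in this article.
# ASSUMPTION: this is NOT the top-20 data set used for Figure 1.
players = {
    "Mitrovic": (17.12, 22),
    "Brereton-Diaz": (17.20, 20),
    "Solanke": (18.42, 18),
    "Piroe": (6.65, 11),
    "Paterson": (3.67, 8),
    "Billing": (4.77, 7),
    "Grabban": (10.04, 10),
    "Gray": (2.51, 4),
    "Anthony": (4.83, 6),
    "Zemura": (1.67, 3),
}


def fit_through_origin(points):
    """Least-squares slope for the model goals = slope * xG (no intercept)."""
    sum_xy = sum(xg * goals for xg, goals in points)
    sum_xx = sum(xg * xg for xg, _ in points)
    return sum_xy / sum_xx


slope = fit_through_origin(list(players.values()))
print(f"Goals per 1 xG (line through the origin): {slope:.2f}")

# Players above the fitted line beat the prediction; those below fall short.
for name, (xg, goals) in players.items():
    side = "above" if goals > slope * xg else "below"
    print(f"{name:<14} xG {xg:>5.2f}  goals {goals:>2}  {side} the line")
```

With no intercept, the least-squares slope reduces to the sum of xG × goals divided by the sum of xG squared, which is all the function computes.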
Assuming that xG was the perfect measure (yes, I know!), two other measures could be informative:
- The absolute difference between goals and xG, although this will be dependent on the absolute xG value (Figure 2); and
- Goals scored divided by xG, minus 1, so that 0 becomes the point at which the two values are the same (Figure 3). This removes the impact of the absolute xG.
From these plots, Danjuma is a clear outlier, which is why he is now playing in the Champions League, although Dike (then at Barnsley and now a new signing for West Brom) also did well. Collins in particular had a poor return, whereas differences in the others could potentially be down to inaccuracy and limitations in the xG statistic, or to the keeper they faced on certain days.
Figure 2: Difference in Goals scored and xG
Figure 3: (Goals scored divided by xG) -1
Other players that caught my eye from last season naturally included AFCB players (Stanislas 9 goals from 9.15 xG, Billing 8 goals from 4.04 xG, and Brooks 5 goals from 5.15 xG) and strikers who are currently playing at our promotion rivals or who have been discussed on the forum (Brereton-Diaz 7 goals from 6.47 xG, and Grabban 6 goals from 9.95 xG).
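The two measures described earlier (goals minus xG, and goals divided by xG minus 1) are easy to work out by hand, but a short sketch shows how they behave on the last-season figures quoted in this article. The function name `finishing_measures` and the layout of the output are my own choices rather than anything taken from the original analysis.

```python
# Last-season (xG, goals) figures quoted in this article.
last_season = {
    "Stanislas": (9.15, 9),
    "Billing": (4.04, 8),
    "Brooks": (5.15, 5),
    "Brereton-Diaz": (6.47, 7),
    "Grabban": (9.95, 6),
    "Gray": (9.56, 5),
}


def finishing_measures(xg, goals):
    """Return (goals - xG, goals / xG - 1); both are 0 when goals equal xG."""
    return goals - xg, goals / xg - 1


# Sort from most clinical to most wasteful on the relative measure.
ranked = sorted(
    last_season.items(),
    key=lambda item: finishing_measures(*item[1])[1],
    reverse=True,
)

for name, (xg, goals) in ranked:
    diff, relative = finishing_measures(xg, goals)
    print(f"{name:<14} goals - xG: {diff:+.2f}   goals / xG - 1: {relative:+.0%}")
```

On these numbers Billing roughly doubles his xG, while Grabban and Gray fall well short of theirs, which is the pattern the relative measure is designed to expose regardless of how many chances a player had.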
In terms of the current season, selection bias could be a big issue, as the person in 20th position (Harry Wilson) has only scored 6 goals; additionally, you will always get better predictions from longer-term data. I’ll provide a brief summary of possible interpretations of that data here. There are, of course, three runaway contenders for top goalscorer: Mitrovic 22 goals from 17.12 xG, Brereton-Diaz 20 goals from 17.20 xG and Solanke 18 goals from 18.42 xG. Other players who are doing particularly well are Piroe at Swansea (11 goals from 6.65 xG), Paterson at Swansea (8 goals from 3.67 xG) and Philip Billing (7 goals from 4.77 xG). Given the Dane’s output last year, this has made me reappraise how clinical he is: the missed header at Derby really appears to be a blip, and if we had one person through on goal then Billing may be a good choice. Lewis Grabban, who has been reported on the forum as having the fewest shots per goal, is matching his xG (10 goals from 10.04 xG). Andre Gray, who was poor last season, appears to have improved, with 4 goals from 2.51 xG. Other AFCB players are Jaidon Anthony (6 goals from 4.83 xG) and his house-mate Jordan Zemura (3 goals from 1.67 xG).
This analysis has shown that the difference in the ability of strikers is much less than I originally thought it would be, with the exception of Danjuma, who we all knew was too good for the Championship. This is far from a definitive analysis, but it may make others question whether there really is a large difference in the clinical nature of finishing in the Championship, with initial stats showing that more than 90% of the variation in goals scored can be explained by the xG stat rather than the ability of the players. Other attributes, such as work rate, link-up play, assists, and creating space for teammates, may all be valued more highly than a marginal gain in finishing, which is why a substantial proportion of AFCB fans (I suspect) wouldn’t switch Solanke for Mitrovic.
Sorry Roger wrote…
Thanks for posting the xG data and this timeline in particular Matt. The data sometimes repeats the obvious and can be misleading but I see it as an interesting alternative dimension to thinking about each game.