Programming Presentation Flashcards
Fantasy Premier League (FPL) has grown rapidly in the past few years and now has over 10 million registered users
Fantasy Premier League (FPL) is a popular game that now has more than 10 million participants. Users can set up private leagues with friends and family, giving the game an additional competitive flavour.
Players receive FPL points based on their contributions in a game: points are awarded for goals, assists, clean sheets, etc.
The game is based on actual Premier League matches, where players are awarded points according to their individual contributions and team results.
The use of data is becoming more common among FPL players trying to gain an edge over their private league rivals; this was the basis for my project
The use of data and AI is becoming more common as FPL players try to gain an edge over their opponents. The aim of my project was to analyse data through a series of plots that would aid with the team selection process and help determine which stats are the most important to consider when selecting an FPL team.
I used 3 datasets in my analysis, one of which was used to produce the majority of the visualisations while the other two were used to construct one piece of analysis each
I carried out the bulk of my analysis on a dataset that included relevant FPL stats about every player in the Premier League. I also made use of two other datasets: one included expected goals information about every Premier League player who had made an appearance in each of the last 5 seasons, and the other consisted of fixture difficulty ratings for each team’s next 10 games.
Everyone starts with a 100m budget and must select 15 players, who are priced based on their performances in the previous season
FPL players start with a fixed budget with which they must purchase players for their team, priced according to their previous season’s form. Thus, the first metric I thought would be useful to consider is the value of players in terms of points per million.
Maximising return on investment leads to maximum points (when the full budget is used): the points per million metric highlights the best-value players
There is a clear advantage to selecting players that give the best return on investment, i.e. the best points:cost ratio. By plotting the top players in terms of this metric, while also excluding players below a certain threshold of total points, the players worth considering for selection are displayed clearly in a simple bar chart.
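The ranking described above can be sketched in a few lines. This is only an illustration: the field names (`cost_m`, `total_points`), the sample rows and the threshold of 50 points are assumptions, not values taken from the project's dataset.

```python
# Hypothetical sample rows; names and numbers are invented for illustration only.
players = [
    {"name": "Player A", "cost_m": 4.5, "total_points": 60},
    {"name": "Player B", "cost_m": 12.5, "total_points": 110},
    {"name": "Player C", "cost_m": 5.0, "total_points": 12},
    {"name": "Player D", "cost_m": 6.0, "total_points": 72},
]


def top_value_players(players, min_points, n):
    """Rank players by points per million, ignoring those below min_points."""
    # Drop players whose total is too low for the ratio to be meaningful.
    eligible = [p for p in players if p["total_points"] >= min_points]
    # Sort by points per million of cost, best value first.
    ranked = sorted(
        eligible,
        key=lambda p: p["total_points"] / p["cost_m"],
        reverse=True,
    )
    return [(p["name"], round(p["total_points"] / p["cost_m"], 2)) for p in ranked[:n]]


# The threshold of 50 points is an assumed value, not the one used in the project.
best_value = top_value_players(players, min_points=50, n=3)
```

In this invented sample the cheap Player A ranks first and the expensive, high-scoring Player B ranks last, which mirrors the bias towards cheap players that the bar chart shows.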
Players with cheap price tags are favoured; some of the best performers score poorly by this metric because they are so expensive
From the plot, it is clear that players who have a cheap price tag are favoured. While this is useful for indicating value for money, some of the highest point scorers so far this season, such as Salah and Haaland, score poorly by this metric because, as historically the best-performing players, they are the most expensive.
This metric can’t be used in isolation; total points, among other metrics, needs to be considered as well
So, while maximising points per million is important, the metric can’t be used in isolation, and total points has to be considered as well.
Total points alone discriminates against players who are returning from injury; points per 90 is more informative of how a player scores when they actually play
The problem with total points as a metric is that it doesn’t account for players whose seasons have been affected by injury. This led me to question whether points per 90 (minutes) could be a more informative criterion for selection, as it would highlight players who scored well while on the pitch.
The problem with points per 90 is that it can favour players who aren’t consistent starters, so it won’t translate to total points in these cases
A possible problem with this approach is that it could also highlight players who have very limited game time not because of injury, but because of their inability to hold down a starting position for their club. To determine whether points per 90 is an appropriate parameter by which to select a team, I plotted a scatter graph to see if it reflected total points appropriately. From this initial plot, it was clear that players with only limited game time were favoured.
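A small worked example shows why limited game time inflates the figure. It assumes points per 90 is simply total points scaled to 90 minutes; the function name and the sample numbers are hypothetical.

```python
def points_per_90(total_points, minutes):
    """Scale a player's total points to a per-90-minutes rate."""
    if minutes <= 0:
        # A player with no minutes has no meaningful rate.
        return 0.0
    return total_points * 90 / minutes


# A regular starter: 120 points over 1800 minutes (20 full matches).
regular = points_per_90(120, 1800)

# A player with a single 90-minute appearance that happened to go well.
cameo = points_per_90(8, 90)
```

Here `cameo` (8.0) is higher than `regular` (6.0) even though the regular starter has fifteen times as many total points, which is the effect visible in the initial scatter plot.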
The use of another stat, starts per 90, allows for exclusion of players who don’t start even when fit (edges in green)
To get around this, I made use of another stat, starts per 90, that was included in my primary dataset. This indicated how often players started the matches in which they played. By setting a minimum starts per 90, I was able to identify players who are unlikely to play many minutes even when fit, and I highlighted them by colouring the edges of their points green. Thus, the plot helps point out players who play most of the time and who deliver the most points during that game time.
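The highlighting rule can be sketched as follows. The threshold of 0.5, the field names and the sample rows are assumptions made for illustration, and black is an arbitrary choice for the default edge colour.

```python
# Assumed cut-off: below this, a player is treated as an unlikely starter.
MIN_STARTS_PER_90 = 0.5

# Hypothetical sample rows; values are invented for illustration only.
players = [
    {"name": "Regular starter", "points_per_90": 5.8, "starts_per_90": 0.95},
    {"name": "Impact substitute", "points_per_90": 7.4, "starts_per_90": 0.20},
    {"name": "Rotation option", "points_per_90": 4.1, "starts_per_90": 0.55},
]


def edge_colour(player, min_starts_per_90=MIN_STARTS_PER_90):
    """Return green for players below the starts-per-90 threshold, else black."""
    return "green" if player["starts_per_90"] < min_starts_per_90 else "black"


# One colour per player, in the same order as the data points.
edge_colours = [edge_colour(p) for p in players]

# Names of the players that would be drawn with green edges.
flagged = [p["name"] for p, c in zip(players, edge_colours) if c == "green"]
```

If the scatter plot were drawn with matplotlib, a list like `edge_colours` could be passed as the `edgecolors` argument of `scatter`, so the flagged players stay on the plot but are easy to discount.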
Fixtures can influence players’ performance massively: players who normally wouldn’t be worth considering for selection can be great picks when they have an ‘easy’ run of fixtures
Although points per 90 and points per million are very important, fixtures (i.e. which team the selected player is playing against in any given game week) can influence the scoring of points dramatically. Thus, even the players ranked highest by the above metrics often perform worse when facing tougher opponents.
Fixtures are ranked from 1-5 using a model, with 1 being the easiest and 5 being the hardest
Fixtures are ranked from 1-5, with 1 being the easiest and 5 being the hardest. A colour-coordinated layered bar chart nicely displays fixture difficulties over the next 3, 6 and 10 game weeks, and helps to make informed choices about the best teams to select players from (at any point during the season).
The graph shows that Brighton (BHA) have a good run of fixtures over the short and medium term
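One way to prepare the values behind such a chart is to total each team's ratings over the three windows. Summing is an assumption here (an average would work equally well), and the ratings below are invented rather than taken from the fixture dataset.

```python
def difficulty_windows(ratings, windows=(3, 6, 10)):
    """Total fixture difficulty over the next 3, 6 and 10 game weeks.

    ratings is a list of 1-5 difficulty values, ordered from the next
    fixture onwards. Lower totals mean an easier run of games.
    """
    return {w: sum(ratings[:w]) for w in windows}


# Hypothetical ratings for one team's next 10 fixtures.
example_ratings = [2, 2, 3, 2, 4, 2, 5, 3, 2, 2]

totals = difficulty_windows(example_ratings)
```

Because each window contains the previous one, the totals can be drawn as layered bars, with the 3-week bar sitting in front of the 6-week and 10-week bars for the same team.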
For example, from the graph it can be seen that Brighton have a run of easier games approaching and so Brighton players should be prioritised for selection in the short term.
Expected goals (xG) is a metric that measures the probability that a specific chance will be converted, with 0 being no chance of scoring and 1 being a guaranteed goal
Amongst the emerging metrics from football-related data is expected goals (or xG), which is now widely used. It effectively measures the probability that a given chance will be scored, returning a decimal between 0 and 1. xG is summed over the course of the season and thus reflects the quality of chances a player is getting.
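As a minimal sketch of the cumulative use of xG, a player's season total is the sum of the probabilities of their individual chances. The per-chance values and the goal count below are invented for illustration.

```python
# Hypothetical xG values for four chances; each is a probability between 0 and 1.
chance_xg = [0.10, 0.76, 0.05, 0.30]

# Every value must be a valid probability.
assert all(0.0 <= x <= 1.0 for x in chance_xg)

# Season xG is the sum of the individual chance probabilities.
season_xg = sum(chance_xg)

# Comparing with goals actually scored shows over- or under-performance.
goals_scored = 2  # invented figure
goals_minus_xg = goals_scored - season_xg
```

With these numbers the season xG is about 1.21, so a player with 2 goals would have scored slightly more than the quality of their chances suggests.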