What is Fantasy Premier League?
Fantasy Premier League is a game that casts you in the role of a Fantasy manager. You are given the task to pick a squad of real-life players who score points for your team based on their performances in their own matches.
FPL hands you a £100m budget which you spend on your squad. You will find that the most expensive players will be the attacking options in midfield and forward positions, while budget options will be inexperienced youngsters or more modest scoring players. Your squad will have 15 players in it but only 11 can start every week, it’s up to you to decide which four to leave on the bench. And it’s also up to you to decide which players to transfer in each week. Your squad can only have a maximum of three players from one team.

The Dilemma
A regular occurrence each year is that you may join Fantasy PL with ambitions of seeing the season through and securing a top-half finish. And then by about November, you forget about it
This made me wonder however – what if it is possible to select a team on Gameweek 1, and simply forget about it for the rest of the season while still getting a strong finish. Using data from the 2020/21 Premier League season, I set about to find the answer.
Data Collection and Cleaning
My first step involved finding an API that included the relevant player data. After inspecting the page source of the Fantasy PL website, I discovered an API link that looked promising. Using a Python request, I could then import this data into a CSV file. The file has 432 rows of data and 19 columns. Making it a rather extensive database.

Rerunning the file through Python, I then began cleaning the player data. The import had caused some formatting issues with accented characters in the players names. Such as Héctor Bellerín who’s name has now changed to “H’©ctor Beller’_n”. I then made sure to add a new column “points_per_mil”, which can be simply calculated by dividing the players total_points by their market price.
It was now time to address the anomaly's caused by 'out-of-position' players. It is a regular occurrence that a player may be assigned the incorrect position in FPL. As an example I will use Stuart Dallas. At 171 points for the price of £5.5, Dallas appears to be a huge bargain. However, a close inspection reveals that he was mistakingly labelled as a defender last season, while his actual playing position is midfield. And you may be wondering what does this mean in terms of his points scored exactly?
Well, defenders get 4 points for a clean sheet while midfielders only receive 1. In addition, midfielders also receive 1 less point for scoring goals. Having scored 8 goals (-1 point) and receiving 12 clean sheets (-3 points), this heavily reduces his points total. 171-36-8=127
And now we’re set to use the data.
Data Visualisations
Before setting ahead with the main task, I entered the CSV file into Tableau to create some data visualisations.
The following data made for some interesting insights. In this scatterplot, we can see that midfield and forward players are the highest scoring and the most expensive.
In this graph we can see the 12 most selected players for the new season so far. Salah and Fernandes remain the most selected, due to their high-scoring season. Meanwhile, Harry Kane remains at just below 29% despite being the second highest scoring player (possibly due to reports of a possible transfer to Man City). Overall, the graph indicates that the highest scoring players are generally the most selected. We can also see that Luke Shaw is highly selected, despite having a lower points total (which may be attributed to him having an excellent run of form towards the end of last season). It is important to note that team selections can be heavily influenced by sensational news articles or discussions from online communitie. For instance at the time of writing this, the Reddit Fantasy FPL page has over 450,000 active members
However, Ivan Toney is the real outlier in this graph. Despite this being his first Premier League season, the striker from newly promoted Brentford has been selected by over 35% of FPL users (possibly due to his ecstatic performances in the lower league and rather low price).
We can also gain many other interesting insights from Tableau. In this scatter plot we can make the observation that more aggressively players tend to score more goals. Son appears in the top scorer chat despite being the least attacking. His former teammate Harry Kane leads in both attacking and goal-scoring stats. This would likely suggest that Son performs better due to Kanes attacking threat and high workrate. Another option Fantasy PL Managers may want to consider if Kane does transfer to Man City.
Investigating with SQL
Loading the data into SQL allows us to make some quick observations of the data. However, it also serves to show the limitations of only using SQL for this project.

Entering in the query, we are returned a list of players with the best possible “points per million” rating. Adding a WHERE clause, we can also filter out players that don’t play above a threshold of minutes. And we can also use filters via the WHERE clause to view by position.
Our suggested team is balanced all-round with a reliable bench, however it also has 4 players each from both Leeds United and Aston Villa (suggesting that these teams have some undervalued players). We also have £6 million left in the budget (which could be better used transferring in a premium player like Salah or Fernandes). Interestingly, a points deducted Dallas still makes the cut for this value-for-money squad.
Creating the Python Algorithm
And now, it’s time to create our algorithm. For this, we will be using a plugin called PuLP. The players are essentially variables. I have created a dictionary that contains all of the players and a value of 0 if they are not selected in the team, and a value of 1 if they are selected. I done this using the pulp.LpVariable.dict() function.

I then list out the constraints. There aren’t too many to worry about here:
Okay, let’s see if we have a team!
And now ‘Algorithm United’ has signed 11 players. Let’s see who made the cut. When PuLP solved the problem, it changed the values in the dictionary to show the starting 11 encoded as 1s. The above loop looks at each entry to our dictionary and prints out those who made the cut.
Entering the players into Fantasy PL, and this is the squad we get! As Fernandes is the highest scoring player in the squad, he receives the captains armband which gives him double points. The squad costs a total of £83 million, which leaves a budget of £17 million for the subs (which is just enough to purchase the four cheapest players available on Fantasy PL.
Despite their impressive scores from last season, Ruben Dias, Mohammad Salah and Jamie Vardy are amongst some of the top players not to make the squad! So, I have just built a Fantasy PL team with an algorithm. The team has a total points tally of 2,248 points. While the world champion of the 20/21 season received 2680.
Would I use this squad as my own team? - Probably not... While it does build a very well balanced team to "set and forget", it also ignores factors such as how newly promoted teams will perform, or possible injuries. What the case study does succeed in however is highlighting undervalued players. History often does repeat itself in the football world, however part of the fun of FPL is going against the grain and having the next best players in your squad a week before the competition.
I will provide an updated post on the success of this team, in which I will compare it to my own personal selection and contrast the two.