Calibrate your own Euro 2016 Predictor with Tableau
The European Championship is starting this Friday and is on everyone's lips. The big question is: who is going to win?
Is it gonna be England, who managed to get 30 points in the qualification round? Is it gonna be France, who have the advantage of playing in their home country? Is it gonna be Sweden, who will surprise and give Zlatan his first big cup title? Or maybe simply Germany, who have shown good results at the European Championships (EC) and are the team that has participated the most times?
Go to the bottom of the post to calibrate your own predictor. Continue reading to learn how I built it using R-Studio, Alteryx and Tableau.
I am playing an online football manager game related to Euro 2016 against my friends, and since I am a bit of a data nerd, I decided to let data help me pick the best team. So I wanted to know which team is going to win Euro 2016.
First, I sat down with pen and paper and developed the initial thoughts.
1) I needed some data: scraping with R, preparing with Alteryx and visualizing with Tableau.
2) I needed some important factors that might influence who will win.
The following factors were chosen:
- FIFA ranking
- Top 4 positions in EC or World Cup (WC) since 2002
- EC points since 2004
- WC points since 2002
- Number of previous appearances in EC
- Home team or not
- Points in qualification
- Zlatan Ibrahimović is on the team (of course - everyone knows the Zlatan-effect)
- Randomness factor
With R-Studio I scraped the data from the following sites:
I passed the data to Alteryx, did some joins and small transformations, and ended up with a small dataset, which I exported as a Tableau Data Extract.
My Alteryx workflow is shown below:
Explanations of the factors
All the factors are potentially equally important. Each has been indexed so that the highest score is 1 and every other team's score is expressed relative to the highest score (Score_Team_i / Score_BestTeam).
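To make the indexing concrete, here is a minimal Python sketch of the Score_Team_i / Score_BestTeam idea. The actual work was done in Alteryx, so the function name `index_scores` and the "Team A" / "Team B" values are purely illustrative assumptions; only Germany's 31 WC points come from the data described in this post. The sketch also assumes the raw scores are non-negative.

```python
def index_scores(raw_scores):
    """Scale raw scores so the best team gets 1.0 and the others a share of it."""
    best = max(raw_scores.values())
    if best <= 0:
        # Dividing by a zero or negative best score would not give a 0..1 index.
        raise ValueError("indexing assumes at least one positive score")
    return {team: score / best for team, score in raw_scores.items()}


# Germany's 31 WC points are from the post; Team A and Team B are hypothetical.
wc_points = {"Germany": 31, "Team A": 15.5, "Team B": 0}
indexed = index_scores(wc_points)

for team, value in indexed.items():
    print(team, value)
```

After indexing, every factor lives on the same 0 to 1 scale, which is what lets the weights later decide how much each factor counts.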
FIFA ranking: -1 x the team's position in the FIFA ranking (Belgium is best (number 2))
Top 4 positions in EC or World Cup (WC) since 2002: A simple count of the times a team has made it to the semifinals (Germany (5 times))
EC points since 2004: The number of points in EC group stages since 2004 (Spain (20 points))
WC points since 2002: The number of points in WC group stages since 2002 (Germany (31 points))
Number of previous appearances in EC: As stated on Wikipedia (Germany (11))
Home team or not: France = 1, everyone else = 0
Points in qualification: Points in the group stage of qualification (France is given 25). Teams in Qualification Group I only played 8 games, whereas teams in the other groups played 10. Therefore the qualification points of teams from Qualification Group I have been multiplied by 10/8 = 1.25. (England (30 points))
Zlatan Ibrahimović is on the team (because he is Zlatan): Sweden = 1, everyone else = 0
Randomness factor (because anything can happen in football): Randomly assigned number between 0 and 1 (Rand() function in Alteryx)
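Two of the factors above involve a small calculation, so here is a rough Python sketch of them. It is not the Alteryx workflow itself: the function names are my own, the 16-point Group I team is a made-up example, and Python's `random` module stands in for Alteryx's Rand() function.

```python
import random

# Group I teams played 8 qualification games, the other groups played 10.
GROUP_I_FACTOR = 10 / 8  # = 1.25


def adjusted_qualification_points(points, in_group_i):
    """Scale Group I points up so they are comparable with 10-game groups."""
    return points * GROUP_I_FACTOR if in_group_i else points


def randomness_factor(rng=random):
    """Return a random number from 0 up to (but not including) 1."""
    return rng.random()


# England's 30 points are from the post; the 16-point team is hypothetical.
print(adjusted_qualification_points(30, in_group_i=False))
print(adjusted_qualification_points(16, in_group_i=True))
print(randomness_factor())
```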
Making the prediction
I have seen a couple of football predictors before. However, I am often annoyed that they are putting too much weight on a certain factor. I would like to make a prediction based on what I find important and allow other people to do the same.
Therefore, I have used parameters in Tableau to allow each user to assign a weight between 0 and 100 to each of the presented factors.
Unfortunately, by making the predictor this dynamic, I have not been able to come up with an algorithm for calculating the outcome of every game. The workbook simply gives you the overall position of all teams according to the weights provided, plus the winner and runner-up in every group.
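For the curious, the ranking logic can be sketched in a few lines of Python. This is only my illustration of the idea, under the assumption that a team's total score is the sum of each indexed factor multiplied by its weight; the real calculation lives in Tableau calculated fields and parameters. The teams, groups and factor values below are hypothetical placeholders.

```python
from collections import defaultdict


def total_score(factors, weights):
    """Weighted sum of indexed factors (assumed scoring rule)."""
    return sum(weights.get(name, 0) * value for name, value in factors.items())


def rank_teams(teams, weights):
    """Team names ordered from highest to lowest total score."""
    return sorted(
        teams,
        key=lambda name: total_score(teams[name]["factors"], weights),
        reverse=True,
    )


def group_top_two(teams, weights):
    """Winner and runner-up per group, taken from the overall ranking."""
    by_group = defaultdict(list)
    for name in rank_teams(teams, weights):
        by_group[teams[name]["group"]].append(name)
    return {group: names[:2] for group, names in by_group.items()}


# Hypothetical teams with already-indexed factor values (0 to 1).
teams = {
    "Team A": {"group": "X", "factors": {"home": 1.0, "qualification": 0.8}},
    "Team B": {"group": "X", "factors": {"home": 0.0, "qualification": 1.0}},
    "Team C": {"group": "X", "factors": {"home": 0.0, "qualification": 0.5}},
    "Team D": {"group": "Y", "factors": {"home": 0.0, "qualification": 0.9}},
}

# Weights between 0 and 100, like the Tableau parameters.
weights = {"home": 50, "qualification": 100}

print(rank_teams(teams, weights))
print(group_top_two(teams, weights))
```

Setting the "home" weight to 0 moves Team B ahead of Team A, which is exactly the kind of what-if the parameters are meant to support.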
I believe that the home team has an advantage and that previous EC performance matters, so I give those two factors a heavy weight, which leads to France as my winner. Which one is yours?
Try the Euro 2016 Predictor here: