Predicting Results and Goals with Machine Learning

I have created a model to predict the outcome of the FIFA World Cup games

Gustavo Santos


Photo by Thomas Serer on Unsplash

We are now one week into the FIFA World Cup 2022. More than 20 games have already been played, with some pretty “predictable” results as well as some surprises, which is not uncommon in football (or soccer, whatever you want to call it).

A couple of days before the tournament’s kickoff, I started to see posts from other data scientists with nice models trying to predict the outcome of the games and, of course, the champion. After all, the World Cup is one of the most desired trophies in the sport.

As a football lover (I am Brazilian, and I call it football because the game is played with the feet… :-) ), I was doubly interested: (1) for the challenge of creating a model and practicing data science skills; and (2) for the chance to work on such an interesting topic.

So, I went ahead and created a classification model to predict the result of the game (win, draw, lose). But I also wanted to know the final score of each game, so I could play with my friends in those small betting pools to see who gets the most results right. Therefore, I also created a second model to predict the score.

Also, let’s agree that football is a sport strongly reliant on the human factor, which makes it much more difficult to model. You will see that even with really good algorithms, the accuracy of the model is never very high. It is more like an informed guess, I’d say.

In this post, I will explain the approach I took for each step.

Let’s go.

Modeling the game result

Image by the author.

The first part of the project was to predict the outcome of the game. Thus, I needed a classifier to predict one of three classes: win, draw or lose. I know there are different ways to create such a model. You can use the history of games, player data, coaches…
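To make the three classes concrete, here is a minimal sketch of how a final score could be turned into the win, draw or lose label that such a classifier learns to predict. The function name `match_result`, the choice to label from the first team's point of view, and the scores themselves are all assumptions made for illustration; they are not real match data and not necessarily how the dataset was built.

```python
from collections import Counter


def match_result(team_goals: int, opponent_goals: int) -> str:
    """Label a match from the first team's point of view (hypothetical helper)."""
    if team_goals > opponent_goals:
        return "win"
    if team_goals < opponent_goals:
        return "lose"
    return "draw"


# Made-up scores for illustration only: (team_goals, opponent_goals)
scores = [(2, 0), (1, 1), (0, 3), (4, 1), (0, 0)]

# Target column for a three-class classifier
labels = [match_result(team, opponent) for team, opponent in scores]

# Class balance is worth checking before training any classifier
class_counts = Counter(labels)

print(labels)
print(class_counts)
```

Checking how often each class appears is useful here because draws are typically less frequent than decisive results, and an imbalanced target can make plain accuracy look better than the model really is.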


