Predictive models of the performance of professional football players

  • Mafalda Teixeira Costa da Conceição (Student)

Student thesis: Master's Thesis


The aim of this study was to predict the performance of professional football players from a set of features, including team-related ones, and to assess the effect of those team-related features on performance. Predictions were made for two player roles, attackers and midfielders, and for two distinct dependent variables: goals total and goals assists. The dataset comprised 4523 players from the 2018-2019 season, drawn from ten top European leagues. The features included individual performance features such as passes accuracy, shots on and duels won; player features such as age, height and weight; club performance features such as club market value and club goals total; and popularity features such as Google search and Twitter average likes. The team-related features were calculated by taking the average of a variable over the whole team, excluding the player in question. The results showed that goals_assists_team, goals_total_midfielder_team and market_value_opponents were the most important variables when predicting goals total, and all were statistically significant (p-value < 0.05). Likewise, goals_assists_team, passes_accuracy_midfielder_team, duels_won_defender_team and market_value_opponents were the most important team-related variables when predicting goals assists, and all were statistically significant (p-value < 0.05). The Stochastic Gradient Descent Regressor was the most suitable Machine Learning (ML) model for predicting goals total, with an RMSE of 1.3543, whereas Ridge Regression achieved an RMSE of 1.054 when predicting goals assists. Clubs and players should be aware of these team factors that affect goals and assists, in order to better understand player-team fit and thereby improve performance.
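The team-related feature construction described in the abstract (a team average that excludes the player concerned) can be illustrated with a minimal sketch. This is not the thesis's own code: the function name `team_average_excluding_player`, the record layout and the toy values are assumptions made purely for illustration, and returning `None` for a player with no teammates in the data is likewise an assumed convention.

```python
from collections import defaultdict


def team_average_excluding_player(rows, feature, team_key="team"):
    """For each row, return the mean of `feature` over the player's teammates.

    Hypothetical helper: the player's own value is removed from the team
    total before averaging, so the feature describes the rest of the team.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row[team_key]] += row[feature]
        counts[row[team_key]] += 1

    result = []
    for row in rows:
        n_teammates = counts[row[team_key]] - 1
        if n_teammates == 0:
            # Assumed convention: no teammates in the data, so no average.
            result.append(None)
        else:
            teammates_total = totals[row[team_key]] - row[feature]
            result.append(teammates_total / n_teammates)
    return result


# Toy data, invented for illustration only.
players = [
    {"player": "A", "team": "X", "goals_assists": 4},
    {"player": "B", "team": "X", "goals_assists": 2},
    {"player": "C", "team": "X", "goals_assists": 0},
    {"player": "D", "team": "Y", "goals_assists": 5},
]

goals_assists_team = team_average_excluding_player(players, "goals_assists")
print(goals_assists_team)
```

Under this sketch, a feature such as goals_assists_team for player A depends only on the assists of players B and C, so the predictor does not simply restate the player's own assists.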
Date of Award: 20 Oct 2021
Original language: English
Awarding Institution
  • Universidade Católica Portuguesa
Supervisor: Viktor Pekar (Supervisor)


  • Feature importance
  • Feature selection
  • Feature significance
  • Hyperparameter tuning
  • Performance prediction (goals total and goals assists)
  • Supervised machine learning models
  • Team-related variables


  • Mestrado em Gestão e Administração de Empresas
