From black box to transparent academic support

  • Daniela Valente (Student)

Student thesis: Master's Thesis


The thesis explores the evolution of education in Portugal, from high illiteracy rates to alignment with European averages in access to higher education. Despite this progress, school dropout remains a persistent challenge. Using machine learning models with an emphasis on explainability, the study aims to predict academic performance and identify dropout risks. Transparency is highlighted as essential for building confidence in practical decisions, such as preventative support strategies. The study uses public data from a Portuguese university and explores demographic, socioeconomic, macroeconomic and academic factors. Two models were developed with the CatBoost algorithm: Model A (after one academic year) and Model B (at enrolment). The results indicate substantially better performance for Model A, but both models show a weakness in their confusion matrices, producing more false positives than false negatives. Given the aim of the analysis, a false positive is more costly than a false negative. To address this problem, an individualised analysis adapted to each model is suggested. The interpretability results show that, after one year, first-year grades have a significant impact on student performance, while at the time of enrolment, age, holding a scholarship and gender also emerge as influential factors. The analysis is intended to inform proactive strategies and personalised support systems that mitigate dropout risks and increase success in Portuguese higher education.
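The trade-off between false positives and false negatives described in the abstract can be illustrated with a minimal sketch. It is not the thesis's method: the labels, scores, candidate thresholds and error costs below are all hypothetical, and which outcome counts as the "positive" class is left as an assumption. The sketch only shows how confusion-matrix counts change with the decision threshold, and how unequal error costs shift the preferred threshold.

```python
from typing import Dict, Sequence

def confusion_counts(y_true: Sequence[int], scores: Sequence[float],
                     threshold: float) -> Dict[str, int]:
    """Count TP, FP, TN, FN when scores >= threshold are predicted as class 1."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            counts["tp"] += 1
        elif predicted == 1 and label == 0:
            counts["fp"] += 1
        elif predicted == 0 and label == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts

def total_cost(counts: Dict[str, int], fp_cost: float, fn_cost: float) -> float:
    """Weighted error cost; correct predictions are assumed to cost nothing."""
    return counts["fp"] * fp_cost + counts["fn"] * fn_cost

def best_threshold(y_true: Sequence[int], scores: Sequence[float],
                   fp_cost: float, fn_cost: float,
                   candidates: Sequence[float]) -> float:
    """Return the candidate threshold with the lowest weighted error cost."""
    return min(
        candidates,
        key=lambda t: total_cost(confusion_counts(y_true, scores, t), fp_cost, fn_cost),
    )

# Hypothetical labels and model scores, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.55, 0.6, 0.4, 0.3, 0.2, 0.45]
candidates = [0.3, 0.5, 0.7]

# Assumed costs: a false positive is three times as costly as a false negative.
chosen = best_threshold(y_true, scores, fp_cost=3.0, fn_cost=1.0, candidates=candidates)
print(chosen, confusion_counts(y_true, scores, chosen))
```

With these made-up numbers, weighting false positives more heavily selects the strictest candidate threshold, which removes the false positives at the price of more false negatives. Reversing the two costs selects the most lenient threshold instead.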
Date of Award: 30 Jan 2024
Original language: English
Awarding Institution
  • Universidade Católica Portuguesa
Supervisor: Ana Marisa Mendes Gonçalves Vinhais Guedes


Keywords
  • Machine learning
  • Explainability
  • Education
  • CatBoost
  • Academic support


  • Mestrado em Análise de Dados para Gestão
