
Predicting early NBA career survivability using pre-draft college statistics

  • Christian Ferdinand Wiethüchter (Student)

Student thesis: Master's Thesis

Abstract

This study examines whether publicly available college basketball statistics can predict early NBA career outcomes, a period both formative for players and financially crucial for franchises under rookie-scale contracts. A dataset was assembled covering all drafted college players from 2002–2019, combining season-level college performance with subsequent NBA results. Because most prospects play multiple seasons before declaring for the draft, season-level predictions were systematically aggregated to the player level to reflect the actual unit of draft-day decision-making. Five outcome labels captured increasing levels of early-career success: surviving the rookie contract, securing rotation or starter roles, and surpassing minimum impact thresholds in Win Shares and Value Over Replacement Player. An end-to-end pipeline embedded preprocessing, grouped cross-validation, and fold-safe aggregation, ensuring reproducibility and preventing information leakage. Regularized linear models served as baselines, while Random Forests and Gradient Boosted Trees benchmarked non-linear performance. The results show that college box-score data contains predictive signal, though modest in strength for most labels. Tree-based methods outperformed linear models: Random Forests were strongest for durability-oriented outcomes such as rotation roles and four-year survival, while boosting captured rarer ceiling outcomes like starters and high-impact contributors. Aggregation to the player level proved essential, with simple averaging often sufficient. Feature importance highlighted class year, games played, assists, and shooting efficiency as consistent though limited predictors. While box scores alone cannot identify future stars with high precision, they provide a systematic, reproducible baseline that helps reduce draft risk by flagging players most likely to contribute early.
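The grouped cross-validation and player-level aggregation described in the abstract can be illustrated with a minimal sketch. This is not the thesis's own pipeline: the function names, the toy data, and the round-robin fold assignment are hypothetical choices made for illustration, and simple averaging is used only because the abstract reports it was often sufficient.

```python
from collections import defaultdict
from statistics import mean


def grouped_folds(player_ids, n_folds):
    """Assign a fold to each season row so that all seasons of one player
    share the same fold (hypothetical round-robin assignment)."""
    unique_players = sorted(set(player_ids))
    fold_of_player = {p: i % n_folds for i, p in enumerate(unique_players)}
    return [fold_of_player[p] for p in player_ids]


def aggregate_to_player(player_ids, season_scores):
    """Average season-level predicted scores into one score per player."""
    by_player = defaultdict(list)
    for pid, score in zip(player_ids, season_scores):
        by_player[pid].append(score)
    return {pid: mean(scores) for pid, scores in by_player.items()}


# Hypothetical toy data: one row per college season, with a made-up
# season-level predicted probability for some outcome label.
player_ids = ["a", "a", "b", "c", "c", "c"]
season_scores = [0.25, 0.75, 0.9, 0.5, 0.25, 0.75]

folds = grouped_folds(player_ids, n_folds=2)
player_scores = aggregate_to_player(player_ids, season_scores)
```

Keeping every season of a player inside a single fold means no player contributes rows to both the training and the validation side of a split, which is the leakage that grouping is meant to prevent. In a fold-safe setup, the aggregation step would be applied to the held-out predictions of each fold separately.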
Date of Award: 14 Oct 2025
Original language: English
Awarding Institution
  • Universidade Católica Portuguesa
Supervisor: Nicolò Bertani

UN SDGs

This student thesis contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 8 - Decent Work and Economic Growth
  2. SDG 9 - Industry, Innovation, and Infrastructure
  3. SDG 12 - Responsible Consumption and Production

Keywords

  • NBA draft
  • College basketball statistics
  • Machine learning
  • Sports analytics
  • Player aggregation
  • Early career prediction

Designation

  • Mestrado em Análise de Dados para Gestão
