Abstract
This study examines whether publicly available college basketball statistics can predict early NBA career outcomes, a period both formative for players and financially crucial for franchises under rookie-scale contracts. A dataset was assembled covering all drafted college players from 2002–2019, combining season-level college performance with subsequent NBA results. Because most prospects play multiple seasons before declaring for the draft, season-level predictions were systematically aggregated to the player level to reflect the actual unit of draft-day decision-making. Five outcome labels captured increasing levels of early-career success: surviving the rookie contract, securing rotation or starter roles, and surpassing minimum impact thresholds in Win Shares and Value Over Replacement Player. An end-to-end pipeline embedded preprocessing, grouped cross-validation, and fold-safe aggregation, ensuring reproducibility and preventing information leakage. Regularized linear models served as baselines, while Random Forests and Gradient Boosted Trees benchmarked non-linear performance. The results show that college box-score data contains predictive signal, though modest in strength for most labels. Tree-based methods outperformed linear models: Random Forests were strongest for durability-oriented outcomes such as rotation roles and four-year survival, while boosting captured rarer ceiling outcomes like starters and high-impact contributors. Aggregation to the player level proved essential, with simple averaging often sufficient. Feature importance highlighted class year, games played, assists, and shooting efficiency as consistent though limited predictors. While box scores alone cannot identify future stars with high precision, they provide a systematic, reproducible baseline that helps reduce draft risk by flagging players most likely to contribute early.

| Date of Award | 14 Oct 2025 |
|---|---|
| Original language | English |
| Awarding Institution | |
| Supervisor | Nicolò Bertani (Supervisor) |
UN SDGs
This student thesis contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 8 Decent Work and Economic Growth
- SDG 9 Industry, Innovation, and Infrastructure
- SDG 12 Responsible Consumption and Production
Keywords
- NBA draft
- College basketball statistics
- Machine learning
- Sports analytics
- Player aggregation
- Early career prediction
Designation
- Mestrado em Análise de Dados para Gestão