Customer churn prediction is a critical task for businesses in competitive markets, particularly online retail. Identifying customers at risk of leaving a service or product allows businesses to implement proactive retention strategies and protect long-term profitability. This thesis investigates the factors that influence customer churn in online retail and develops predictive models to anticipate churn behavior. Combining machine learning with interpretability and explainability methods, the study examines how customer attributes such as demographic information, purchasing behavior, and satisfaction scores affect churn prediction. The analysis uses a comprehensive dataset containing customer attributes, transaction history, and responses to marketing campaigns. By employing logistic regression models, gradient boosting models, and interpretability techniques such as SHAP (SHapley Additive exPlanations), this research aims to provide actionable insights that help businesses mitigate churn and strengthen customer retention in online retail. The findings highlight the importance of features such as average transaction amount, annual income, and recency of last purchase in predicting customer churn, and show that gradient boosting models outperform logistic regression models in this context.
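As a rough illustration of the idea behind SHAP, the sketch below computes exact Shapley attributions for a toy logistic churn score. It is not the thesis's model or the SHAP library: the function names (`churn_score`, `shapley_values`), the weights, the customer values, and the baseline are all hypothetical, chosen only to mirror the three features named in the abstract. Features "absent" from a coalition are replaced by their baseline value, which is one common simplifying assumption.

```python
from itertools import combinations
from math import exp, factorial


def churn_score(features):
    """Hypothetical logistic churn model; the weights are illustrative only."""
    z = (
        0.8
        - 0.02 * features["avg_transaction_amount"]
        - 0.00001 * features["annual_income"]
        + 0.01 * features["days_since_last_purchase"]
    )
    return 1.0 / (1.0 + exp(-z))


def shapley_values(model, instance, baseline):
    """Exact Shapley attribution of model(instance) - model(baseline).

    Enumerates every coalition of the other features, so it is only
    practical for a handful of features.
    """
    names = list(instance)
    n = len(names)
    values = {}
    for name in names:
        others = [m for m in names if m != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for a coalition of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                # Features outside the coalition take their baseline value.
                with_feature = {
                    m: instance[m] if (m in subset or m == name) else baseline[m]
                    for m in names
                }
                without_feature = {
                    m: instance[m] if m in subset else baseline[m]
                    for m in names
                }
                total += weight * (model(with_feature) - model(without_feature))
        values[name] = total
    return values


# Hypothetical customer and reference ("average") customer.
customer = {
    "avg_transaction_amount": 40.0,
    "annual_income": 30000.0,
    "days_since_last_purchase": 120.0,
}
baseline = {
    "avg_transaction_amount": 60.0,
    "annual_income": 50000.0,
    "days_since_last_purchase": 30.0,
}

phi = shapley_values(churn_score, customer, baseline)
for feature, contribution in phi.items():
    print(f"{feature}: {contribution:+.4f}")
```

The attributions sum to the difference between the customer's score and the baseline score, which is the additivity property that makes such explanations easy to read feature by feature. For models with many features, exact enumeration is infeasible, so estimating or computing these values more efficiently becomes necessary.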
- Churn prediction
- Logistic regression
- Gradient boosting
- Interpretability & explainability
- SHAP
- Mestrado em Análise de Dados para Gestão
Customer churn prediction
Fumo, D. (Student). 24 Jun 2024
Student thesis: Master's Thesis