Predicting hourly demand for shared bicycles with weather data and machine learning models

  • Dijora Peja (Student)

Student thesis: Master's Thesis

Abstract

This thesis aims to analyze the bike-sharing system in Chicago and to apply predictive models that accurately forecast the hourly demand for shared bicycles using time-related and weather-related features. The dependent variable is Count, the total number of bicycles used per hour. The predictive models applied to this regression problem are Linear Regression, Random Forest, Gradient Boosting, Light Gradient Boosting, Extreme Gradient Boosting, and Multi-Layer Perceptron. Model accuracy is measured by the R² score, Root Mean Square Error, and Mean Absolute Error. To improve predictions, different hyperparameters are tuned in the predictive models. Without hyperparameter tuning, Random Forest achieves the best accuracy measures. After tuning, however, Gradient Boosting produces the most accurate predictions. The accuracy of Gradient Boosting improves with tuning, whereas Random Forest is almost unaffected by it for this regression problem. The second-best model after tuning is Extreme Gradient Boosting. The neural network model, Multi-Layer Perceptron, gives less accurate results than Random Forest and the boosting models for this type of problem. The features most important for accurate forecasts are Temperature, Hour, Weekend, Pressure, Uv_Index, and Day.
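The three accuracy measures named in the abstract can be illustrated with a minimal sketch in plain Python. The function name `regression_metrics` and the sample hourly counts are hypothetical and are not taken from the thesis, which does not state here how the measures were computed.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (r2, rmse, mae) for two equal-length sequences of numbers.

    Hypothetical helper for illustration only; not the thesis's code.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Mean Absolute Error: average size of the errors, in bikes per hour.
    mae = sum(abs(e) for e in errors) / n

    # Root Mean Square Error: penalizes large errors more heavily than MAE.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # R^2: share of the variance in the observed counts explained by the model.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot

    return r2, rmse, mae


# Made-up hourly counts, used only to exercise the function.
actual = [10, 20, 30, 40]
predicted = [12, 18, 33, 37]
r2, rmse, mae = regression_metrics(actual, predicted)
print(f"R2={r2:.3f}  RMSE={rmse:.3f}  MAE={mae:.3f}")
```

A higher R² and lower RMSE and MAE indicate a better fit, which is the sense in which the abstract ranks the models.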
Date of Award: 3 Feb 2023
Original language: English
Awarding Institution
  • Universidade Católica Portuguesa
Supervisor: Nicolò Bertani

Keywords

  • Demand forecasting
  • Shared bikes
  • Weather data
  • Machine learning
  • Predictive models

Designation

  • Mestrado em Análise de Dados para Gestão
