Crime inference using machine learning and geographical data

  • Miguel Francisco Frade Roque (Student)

Student thesis: Master's Thesis

Abstract

Crimes are not random events in society; something must influence their occurrence. By characterizing the environment, it is possible to build algorithms that predict criminal activity in a given place at a given time, which allows it to be anticipated and prevented through public-policy decision-making.

This study focuses on finding the best way to predict crimes, that is, which types of features are the most important to consider when predicting crimes and which methods are the most predictive.

An analysis of the city of Philadelphia, Pennsylvania (USA), is carried out, taking into account the urban, racial, demographic and socioeconomic characteristics of its different geographical blocks and the number of criminal occurrences in each of them over multiple years. The methods used are both linear and non-linear.

When non-linear methods are used, via machine learning techniques, the prediction of the number of crimes is clearly more accurate for every type of variable. This leads to the conclusion that the relationships studied here are not linear in nature, and that tree-based models (especially gradient boosting and random forest) are therefore the most suitable approach for this data. From this perspective, the models that consider only the socio-demographic characteristics of the neighborhoods are significantly more effective in forecasting than the purely urban ones.
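The contrast between linear and tree-based prediction described in the abstract can be illustrated with a minimal sketch. It is not the thesis's code or data: the single feature (a hypothetical "income index"), the step-shaped relationship, the noise level and all function names are assumptions made for illustration only. It fits an ordinary least-squares line and one small regression tree (the building block of random forests and gradient boosting, not either ensemble itself) and compares their test errors.

```python
import random

def fit_linear(xs, ys):
    """Ordinary least squares for a single feature: y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return lambda x: a + b * x

def sse(values):
    """Sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def fit_tree(xs, ys, depth=3, min_leaf=5):
    """Small regression tree on one feature; splits minimize squared error."""
    mean = sum(ys) / len(ys)
    if depth == 0 or len(xs) < 2 * min_leaf:
        return lambda x: mean
    pairs = sorted(zip(xs, ys))
    best_cost, best_thr = None, None
    for i in range(min_leaf, len(pairs) - min_leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical feature values
        cost = sse([y for _, y in pairs[:i]]) + sse([y for _, y in pairs[i:]])
        if best_cost is None or cost < best_cost:
            best_cost = cost
            best_thr = (pairs[i - 1][0] + pairs[i][0]) / 2
    if best_thr is None:
        return lambda x: mean
    left = [(x, y) for x, y in pairs if x <= best_thr]
    right = [(x, y) for x, y in pairs if x > best_thr]
    left_model = fit_tree([x for x, _ in left], [y for _, y in left],
                          depth - 1, min_leaf)
    right_model = fit_tree([x for x, _ in right], [y for _, y in right],
                           depth - 1, min_leaf)
    return lambda x: left_model(x) if x <= best_thr else right_model(x)

def mse(model, xs, ys):
    """Mean squared error of a fitted model on (xs, ys)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def synthetic_block(rng):
    """Assumed toy data: crime count is a step function of an income index."""
    x = rng.random()
    base = 20.0 if x < 0.3 else (5.0 if x < 0.7 else 12.0)
    return x, base + rng.gauss(0.0, 1.0)

rng = random.Random(0)
train = [synthetic_block(rng) for _ in range(200)]
test = [synthetic_block(rng) for _ in range(100)]
train_x, train_y = zip(*train)
test_x, test_y = zip(*test)

linear_model = fit_linear(train_x, train_y)
tree_model = fit_tree(list(train_x), list(train_y))

mse_linear = mse(linear_model, test_x, test_y)
mse_tree = mse(tree_model, test_x, test_y)
print(f"linear test MSE: {mse_linear:.2f}")
print(f"tree test MSE:   {mse_tree:.2f}")
```

On data built this way the tree has the lower test error because the assumed relationship is a step function, which a straight line cannot follow. This shows the mechanism only and says nothing about the size of the effect on the Philadelphia data.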
Date of Award: 3 Feb 2023
Original language: English
Awarding Institution
  • Universidade Católica Portuguesa
Supervisor: Nicolò Bertani

Keywords

  • Crimes
  • Socio-demographic
  • Urban
  • Linear
  • Non-linear

Designation

  • Mestrado em Análise de Dados para Gestão
