Predicting offenses among individuals with psychiatric disorders - A machine learning approach
BACKGROUND: Actuarial risk estimates are considered the gold standard for assessing whether psychiatric patients are likely to commit future criminal offenses. However, these estimates cannot predict the type of offense an individual patient will subsequently commit, and often assess only the general likelihood of crime occurring in a group sample. Advancing the predictive utility of risk assessments requires better statistical strategies.
AIM: To develop a machine learning model that predicts, at an individual level, the type of criminal offense committed in a large transdiagnostic sample of psychiatric patients.
METHOD: Machine learning algorithms (Random Forest, Elastic Net, SVM) were applied to a representative and diverse sample of 1240 patients in the forensic mental health system. Clinical, historical, and sociodemographic variables were considered as potential predictors and assessed in a data-driven way. Separate models were created for each type of criminal offense, and feature selection methods were used to improve the interpretability and generalizability of the findings.
RESULTS: Sexual offenses can be distinguished from nonviolent and violent offenses at an individual level with a sensitivity of 82.44% and a specificity of 60.00%, using only 36 variables. Furthermore, in a binary classification model, sexual and violent offenses can be predicted at an individual level with 83.26% sensitivity and 77.42% specificity using only 20 clinical variables. Likewise, nonviolent and sexual offenses can be individually predicted with 74.60% sensitivity and 80.65% specificity using 30 clinical variables.
CONCLUSION: The current results suggest that machine learning models can show greater accuracy than gold-standard risk assessment tools (AUCs 0.70-0.80). Moreover, unlike existing risk tools, this approach allows for the prediction of cases at an individual level, which is more clinically useful.
It is important to note, however, that a large subset of patients in the sample had been involved in the criminal justice system before receiving an official diagnosis. Many of the variables that predict offenses may therefore reflect issues arising from those prior offenses. Nonetheless, the accuracy of prospective models is expected to improve with further refinement.
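The sensitivity and specificity figures reported in the results follow from the confusion matrix of a binary classifier. The sketch below shows how the two quantities are computed. It is an illustration only: the function name `sensitivity_specificity`, the choice of the sexual-offense class as the positive label, and the ten example labels are assumptions made for this example, not data or code from the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels (1 = positive class)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    sensitivity = tp / (tp + fn)  # share of actual positives that were detected
    specificity = tn / (tn + fp)  # share of actual negatives that were cleared
    return sensitivity, specificity


# Hypothetical held-out labels (assumption): 1 = sexual offense, 0 = other offense.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2%}, specificity={spec:.2%}")
# prints: sensitivity=80.00%, specificity=60.00%
```

In this toy case four of five positive cases are detected and three of five negative cases are correctly cleared. Reporting both numbers matters because a model can reach high sensitivity simply by over-predicting the positive class, at the cost of specificity.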