Machine learning used to study risk factors for chronic diseases: A scoping review.
Journal Articles
abstract
OBJECTIVES: Machine learning (ML) has received significant attention for its potential to process and learn from vast amounts of data. Our aim was to perform a scoping review to identify studies that used ML to study risk factors for chronic diseases at a population level, notably those that incorporated methods to mitigate algorithmic bias. We focused on ML applications for the most common risk factors for chronic disease: tobacco use, alcohol use, unhealthy eating, physical activity, and psychological stress.

METHODS: We searched the peer-reviewed, indexed literature using Medline (Ovid), Embase (Ovid), Cochrane Central Register of Controlled Trials and Cochrane Database of Systematic Reviews (Ovid), Scopus, ACM Digital Library, INSPEC, and Web of Science's Science Citation Index, Social Sciences Citation Index, and Emerging Sources Citation Index. Among the included studies, we examined whether bias was considered and identified the strategies employed to mitigate it.

SYNTHESIS: The search identified 10,329 studies, of which 20 met our inclusion criteria. The included studies used ML for a wide range of goals: predicting chronic disease development, automating the classification of data, and identifying new associations between risk factors and disease. Nine studies (45%) included some discussion of algorithmic bias. Studies that incorporated a broad array of sociodemographic variables did so primarily to improve the performance of an ML model rather than to mitigate potential harms to populations made vulnerable by social and economic policies.

CONCLUSION: This work contributes to our understanding of how ML can be used to advance population and public health.