A Deep Learning Method to Automatically Identify Reports of Scientifically Rigorous Clinical Research from the Biomedical Literature: Comparative Analytic Study

abstract

BACKGROUND: A major barrier to the practice of evidence-based medicine is efficiently finding scientifically sound studies on a given clinical topic.
OBJECTIVE: To investigate a deep learning approach to retrieve scientifically sound treatment studies from the biomedical literature.
METHODS: We trained a Convolutional Neural Network using a noisy dataset of 403,216 PubMed citations with title and abstract as features. The deep learning model was compared with state-of-the-art search filters: PubMed's Clinical Queries Broad treatment filter, McMaster's textword search strategy (no Medical Subject Headings [MeSH] terms), and the Clinical Queries Balanced treatment filter. A previously annotated dataset (Clinical Hedges) was used as the gold standard.
RESULTS: The deep learning model obtained significantly lower recall than the Clinical Queries Broad treatment filter (96.9% vs 98.4%; P<.001), and recall equivalent to McMaster's textword search (96.9% vs 97.1%; P=.57) and the Clinical Queries Balanced filter (96.9% vs 97.0%; P=.63). Deep learning obtained significantly higher precision than the Clinical Queries Broad filter (34.6% vs 22.4%; P<.001) and McMaster's textword search (34.6% vs 11.8%; P<.001), but significantly lower precision than the Clinical Queries Balanced filter (34.6% vs 40.9%; P<.001).
CONCLUSIONS: Deep learning performed well compared to state-of-the-art search filters, especially when citations were not indexed. Unlike previous machine learning approaches, the proposed deep learning model does not require feature engineering or time-sensitive or proprietary features, such as MeSH terms and bibliometrics. Deep learning is a promising approach to identifying reports of scientifically rigorous clinical research. Further work is needed to optimize the deep learning model and to assess generalizability to other areas, such as diagnosis, etiology, and prognosis.
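The recall and precision figures in the abstract can be made concrete with a short sketch. It assumes the usual definitions for a search filter scored against a gold-standard set: precision is the share of retrieved citations that are sound, and recall is the share of sound citations that are retrieved. The function names and toy citation IDs are hypothetical and do not come from the study. The "number needed to read" (1 / precision) is added here as an interpretive aid, not a result the abstract reports; only the four precision percentages are taken from the abstract.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for retrieved IDs scored against a gold-standard set."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


def number_needed_to_read(precision):
    """Average number of retrieved citations screened per sound study found (1 / precision)."""
    return 1.0 / precision


# Hypothetical toy data: these citation IDs are invented for illustration only.
gold_standard = {1, 2, 3, 4}
filter_output = {1, 2, 3, 7, 8, 9}
toy_precision, toy_recall = precision_recall(filter_output, gold_standard)

# Precision values (percent) as reported in the abstract.
reported_precision = {
    "Deep learning": 34.6,
    "Clinical Queries Broad": 22.4,
    "McMaster textword": 11.8,
    "Clinical Queries Balanced": 40.9,
}
nnr = {name: number_needed_to_read(p / 100) for name, p in reported_precision.items()}

if __name__ == "__main__":
    print(f"toy precision={toy_precision:.2f}, recall={toy_recall:.2f}")
    for name, value in nnr.items():
        print(f"{name}: about {value:.1f} citations per sound study")
```

Read this way, the reported precision gap between deep learning and the McMaster textword search corresponds to roughly three versus eight or more citations screened per sound study, at nearly the same recall.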

authors

Del Fiol, Guilherme
Michelson, Matthew
Iorio, Alfonso
Cotoi, Chris
Haynes, Robert Brian

publication date

June 25, 2018

has subject area

08 Information and Computing Sciences (FoR)
11 Medical and Health Sciences (FoR)
17 Psychology and Cognitive Sciences (FoR)
Deep Learning (MeSH)
Humans (MeSH)
Information Storage and Retrieval (MeSH)
Medical Informatics (Science Metrix)
Neural Networks, Computer (MeSH)
PubMed (MeSH)

published in

Journal of Medical Internet Research
