Validation of a natural language processing algorithm to identify adenomas and measure adenoma detection rates across a health system: a population-level study
Journal Articles
Overview
abstract
BACKGROUND AND AIMS: Measuring adenoma detection rates (ADRs) at the population level is challenging because pathology reports are often written in an unstructured format; further, there is significant variation in reporting methods across institutions. Natural language processing (NLP) can be used to extract relevant information from text-based records. We aimed to develop and validate an NLP algorithm to identify colorectal adenomas that could be used to report ADR at the population level in Ontario, Canada.
METHODS: The sampling frame included pathology reports from all colonoscopies performed in Ontario in 2015 and 2016. Two random samples of 450 and 1000 reports were selected as the training and validation sets, respectively. Expert clinicians reviewed and classified reports as adenoma or other. The training set was used to develop an NLP algorithm (to identify adenomas) that was evaluated using the validation set. The NLP algorithm test characteristics were calculated using expert review as the reference. We used the algorithm to measure ADR for all endoscopists in Ontario in 2019.
RESULTS: The 1450 pathology reports were derived from 62 laboratories, 266 pathologists, and 532 endoscopists. In the training set, the NLP algorithm for any adenoma had a sensitivity of 99.60% (95% confidence interval (CI), 97.77-99.99), specificity of 99.01% (95% CI, 96.49-99.88), positive predictive value of 99.19% (95% CI, 97.12-99.90), and F1 score of 0.99. Similar results were obtained for the validation set. The median ADR was 33% (interquartile range, 26%-40%).
CONCLUSIONS: When we used a population-based sample from Ontario, our NLP algorithm was highly accurate and was used at the system level to measure ADR.