Journal article

Evaluation of artificial intelligence-based tool Covidence in literature screening for guideline updates: a prospective study

Abstract

Background Regularly updating literature evidence for clinical practice guidelines (CPGs) is necessary but time-intensive. Artificial intelligence (AI) tools, such as Covidence, may accelerate literature screening. This study prospectively evaluated the effectiveness and accuracy of two Covidence functions in two cancer-related CPGs: the auto-marking filter (adopted from Cochrane’s RCT classifier) for identifying randomized controlled trials (RCTs), and the machine learning-assisted title and abstract (Stage I) screening.
Methods We updated the literature searches for the breast and lung cancer CPGs at the Program in Evidence-Based Care (PEBC), Ontario Health (Cancer Care Ontario). Two methodologists trained the machine learning model by indicating which of the first 25 references were tagged by Covidence’s RCT filter. Main outcomes included workload and time savings, Work Saved over Sampling (WSS), and the impact on whether original guideline recommendations required changes.
Results A total of 1,270 (breast cancer) and 1,734 (lung cancer) references were imported into Covidence. Among them, 633 breast cancer and 656 lung cancer references were tagged by Covidence’s RCT filter as possible RCTs and required review at Stage I. Covidence’s RCT filter excluded non-RCTs with high accuracy, reducing manual screening by 50.2% (breast cancer) and 62.2% (lung cancer). For Stage I screening, Covidence reduced workload by 48.3% and 55.2% at 95% sensitivity, and by 44.2% and 16.9% at 100% sensitivity, for the breast and lung cancer CPGs, respectively. WSS values aligned with these reductions. Because few references were screened and initial training on Covidence took time, time savings were minimal. At 95% sensitivity, one included reference per CPG was missed, but guideline recommendations remained unchanged.
Conclusion Covidence AI-assisted screening may effectively support updating literature reviews for oncology CPGs by reducing workload without compromising the integrity of final recommendations. A 95% detection threshold may provide a practical balance between efficiency and accuracy.
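
The workload figures reported for the RCT filter can be reproduced from the reference counts given in the Results. The sketch below is illustrative only and is not the authors' analysis code. The function names are hypothetical, and the WSS function uses the commonly cited definition, WSS@R = (TN + FN) / N - (1 - R); the abstract does not state which formula the study applied, so that is an assumption.

```python
def workload_reduction(total_refs: int, refs_to_screen: int) -> float:
    """Percentage of imported references that no longer need manual screening."""
    if total_refs <= 0:
        raise ValueError("total_refs must be positive")
    return 100.0 * (total_refs - refs_to_screen) / total_refs


def wss(true_negatives: int, false_negatives: int,
        total_refs: int, recall: float) -> float:
    """Work Saved over Sampling at a given recall (sensitivity).

    Assumed definition: (TN + FN) / N - (1 - recall), i.e. the fraction of
    references not screened, minus the saving expected from random sampling
    at the same recall.
    """
    if total_refs <= 0:
        raise ValueError("total_refs must be positive")
    return (true_negatives + false_negatives) / total_refs - (1.0 - recall)


# Counts taken from the Results: references imported vs. tagged as possible RCTs.
breast = workload_reduction(total_refs=1270, refs_to_screen=633)
lung = workload_reduction(total_refs=1734, refs_to_screen=656)

print(f"Breast cancer CPG: {breast:.1f}% fewer references to screen")
print(f"Lung cancer CPG: {lung:.1f}% fewer references to screen")
```

The two printed percentages match the 50.2% and 62.2% reductions stated in the abstract. The per-class counts needed to evaluate `wss` are not given in the abstract, so no WSS value is computed here.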

Authors

Yao X; Low A; Sivajohanathan D; Vella E; Wang P; Kaur G; Saha A; Sussman J

Journal

Intelligent Medicine

Publisher

Elsevier

Publication Date

December 1, 2025

DOI

10.1016/j.imed.2025.12.006

ISSN

2096-9376
