Increasing the predictive accuracy of the Resistance Gene Identifier by evaluating antimicrobial resistance gene over- and underprediction

Abstract

Computational biology is paving the way towards accessible antimicrobial resistance (AMR) gene (ARG) detection methods that complement gold-standard phenotypic diagnostics for antimicrobial surveillance. However, obtaining an accurate depiction of phenotypic resistance through genotypic methods requires addressing potential over- and underprediction of ARGs. This study assessed the accuracy of the manually curated bit-score cutoff values associated with bioinformatic models within the Comprehensive Antibiotic Resistance Database (CARD), and the effects of erroneous cutoff curation, which can lead to Type I and Type II errors, on CARD's in silico resistome prediction tool, the Resistance Gene Identifier (RGI). CARD models rarely overpredicted resistance-associated sequences and mutations (5 of 3,900 models had false positive rates >5%) but underpredicted them more often (739 of 3,900 models had false negative rates >5%), emphasizing RGI’s conservative prediction algorithms. When curation inaccuracies were broken down by AMR gene family, efflux-related families were the main contributors to overprediction (likely due to human curation error), while beta-lactamase families were the main contributors to underprediction, a finding that highlights systemic curation deficiencies.
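
The abstract does not spell out how the per-model rates were computed, so the following is only a minimal sketch of how a single bit-score cutoff could be scored against labelled sequences. The function names, the example scores, and the choice to treat a score at or above the cutoff as a predicted hit are all assumptions made for illustration; only the 5% threshold comes from the abstract.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ErrorRates:
    false_positive_rate: float  # Type I: non-resistance sequences called as hits
    false_negative_rate: float  # Type II: resistance sequences missed


def error_rates(positive_scores: Sequence[float],
                negative_scores: Sequence[float],
                cutoff: float) -> ErrorRates:
    """Score one model's cutoff against labelled bit scores.

    Assumption: a sequence is predicted as a hit when its bit score
    is greater than or equal to the cutoff.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both labelled score sets must be non-empty")
    false_negatives = sum(1 for s in positive_scores if s < cutoff)
    false_positives = sum(1 for s in negative_scores if s >= cutoff)
    return ErrorRates(
        false_positive_rate=false_positives / len(negative_scores),
        false_negative_rate=false_negatives / len(positive_scores),
    )


def flag_model(rates: ErrorRates, threshold: float = 0.05) -> List[str]:
    """Label a model whose error rate exceeds the threshold (5% in the abstract)."""
    labels = []
    if rates.false_positive_rate > threshold:
        labels.append("overpredicts")
    if rates.false_negative_rate > threshold:
        labels.append("underpredicts")
    return labels


# Hypothetical bit scores for one model; not data from the study.
known_resistant = [520.0, 498.0, 610.0, 455.0, 580.0]
known_non_resistant = [120.0, 300.0, 410.0, 95.0]

rates = error_rates(known_resistant, known_non_resistant, cutoff=500.0)
labels = flag_model(rates)
```

With these made-up scores, a cutoff of 500 misses two of the five resistant sequences and admits none of the non-resistant ones, so the model would be flagged as underpredicting: a cutoff set too high trades false positives for false negatives. For scale, the counts reported in the abstract correspond to roughly 0.13% of models (5 of 3,900) exceeding the false positive threshold and roughly 18.9% (739 of 3,900) exceeding the false negative threshold.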

Authors

Mukiri KM; Alcock BP; Raphenya AR; McArthur AG

Publication date

December 15, 2025

DOI

10.64898/2025.12.11.693720

Preprint server

bioRxiv