GRADE guidance 37: rating imprecision in a body of evidence on test accuracy

Abstract

OBJECTIVES: To provide guidance on rating imprecision in a body of evidence assessing the accuracy of a single test. This guidance clarifies when Grading of Recommendations Assessment, Development and Evaluation (GRADE) users should consider rating down the certainty of evidence by one or more levels for imprecision in test accuracy.
STUDY DESIGN AND SETTING: A project group within the GRADE working group conducted iterative discussions and presentations at GRADE working group meetings to produce this guidance.
RESULTS: Before rating the certainty of evidence, GRADE users should define the target of their certainty rating. GRADE recommends that users set judgment thresholds defining what they consider a very accurate, accurate, inaccurate, and very inaccurate test. These thresholds should be set after considering the consequences of testing and the effects on people-important outcomes. GRADE's primary criterion for judging imprecision in test accuracy evidence is the confidence interval (CI) approach: examining the CIs of absolute test accuracy results (true-positive, false-positive, true-negative, and false-negative results in a cohort of people). Under the CI approach, when a CI appreciably crosses the predefined judgment threshold(s), one should consider rating down the certainty of evidence by one or more levels, depending on the number of thresholds crossed. When the CI does not cross the judgment threshold(s), GRADE suggests considering whether the sample size is adequate for a sufficiently powered test accuracy review, that is, the optimal information size (OIS) or review information size (RIS), when rating imprecision. If the combined sample size of the studies included in the review is smaller than the required OIS/RIS, one should consider rating down by one or more levels for imprecision.
CONCLUSION: This paper extends previous GRADE guidance for rating imprecision in single test accuracy systematic reviews and guidelines, with a focus on the circumstances in which one should consider rating down one or more levels for imprecision.
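
ILLUSTRATION: The two-step logic summarized under RESULTS (CI approach first, OIS/RIS check second) can be sketched in Python as below. This is a minimal, hypothetical sketch and not part of the GRADE guidance: the function and field names are invented for illustration, "appreciably crosses" is reduced to a simple strict-inequality check, and the mapping of one level per threshold crossed (capped at three) and one level for an insufficient sample size are assumptions. The guidance itself leaves the number of levels to judgment.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ImprecisionSuggestion:
    thresholds_crossed: int   # judgment thresholds lying strictly inside the CI
    levels_to_consider: int   # suggested levels to rate down (assumption, see below)
    reason: str


def suggest_imprecision_rating(
    ci_lower: float,
    ci_upper: float,
    thresholds: Sequence[float],
    total_sample_size: int,
    required_information_size: int,
    max_levels: int = 3,
) -> ImprecisionSuggestion:
    """Suggest how many levels to consider rating down for imprecision.

    ci_lower / ci_upper: CI of an absolute accuracy result, for example
    false negatives per 1,000 people tested.
    thresholds: predefined judgment thresholds on the same scale.
    required_information_size: the OIS/RIS, computed elsewhere.
    """
    if ci_lower > ci_upper:
        raise ValueError("ci_lower must not exceed ci_upper")

    # Step 1 (CI approach): count thresholds that fall inside the interval.
    crossed = sum(1 for t in thresholds if ci_lower < t < ci_upper)
    if crossed > 0:
        # Assumption: one level per threshold crossed, capped at max_levels.
        return ImprecisionSuggestion(
            crossed, min(crossed, max_levels), "CI crosses judgment threshold(s)"
        )

    # Step 2 (OIS/RIS): reached only when no threshold is crossed.
    if total_sample_size < required_information_size:
        # Assumption: a single level; the guidance allows one or more.
        return ImprecisionSuggestion(0, 1, "sample size below OIS/RIS")

    return ImprecisionSuggestion(0, 0, "no imprecision concern flagged")
```

For example, with hypothetical thresholds of 30, 50, and 80 false negatives per 1,000 people tested, a CI of 20 to 60 contains two thresholds, so the sketch would suggest considering rating down by two levels. A CI of 35 to 45 contains none, so the result then depends only on whether the pooled sample size reaches the OIS/RIS.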