Interpreting GRADE's levels of certainty or quality of the evidence: GRADE for statisticians, considering review information size or less emphasis on imprecision?
This article responds to issues raised by Antilla et al. in the Journal of Clinical Epidemiology about the Grading of Recommendations Assessment, Development and Evaluation (GRADE) Working Group's approach to rating imprecision and GRADE's use of statistics. They argue that GRADE confuses statistical terms and should provide a stepwise rating of imprecision for making decisions. Here, I clarify those perceptions. Rating imprecision and the other GRADE quality of evidence domains is an iterative process, and systematic review authors may or may not consider people-important thresholds of effect when they rate imprecision. Regardless of the ratings in systematic reviews, those suggesting decisions, such as guideline panels, should consider whether they agree with or need to revise these suggested thresholds to make informed ratings about imprecision. Decision-relevant thresholds are the result of a complex interplay between the outcomes that are critical for decision-making. The certainty in the evidence for one critical outcome and the resulting possible certainty range, which I conceptualize in this article, may influence ratings of other outcomes. To relieve systematic review authors of the often challenging burden of defining worthwhile or important effects for judging precision based on the optimal information size (OIS), a modified OIS, or review information size (RIS), could be used to rate imprecision at the systematic review stage. The RIS focuses only on plausible effects rather than on plausible and worthwhile effects. The advantages of using the RIS include avoiding reliance on statistical significance alone and avoiding the varying thresholds that result from the importance and the baseline risk of the outcome, on which the OIS relies.
Finally, I argue that GRADE's certainty in the evidence is related to the statistical definition of accuracy. However, because GRADE is also applied to other ratings of certainty, such as those for qualitative evidence, statistical accuracy does not serve as a definition of GRADE's quality or certainty in the evidence.