Both the Standard Protocol Items: Recommendations for Interventional Trials (SPIRIT) and Consolidated Standards of Reporting Trials (CONSORT) guidelines recommend that clinical trials follow a study framework that aligns with their objective to test the relative efficacy or safety (equality) or effectiveness (superiority, noninferiority, or equivalence) between interventions. We conducted a systematic review to assess the proportion of studies that were inconsistent across the framing of their research question, sample size calculation, and conclusion, and the proportion that should have framed their research question differently given the interventions compared.
We included studies from 5 high-impact-factor orthopaedic journals published in 2017 and 2019 that compared at least 2 interventions using patient-reported outcome measures.
We included 228 studies. The sample size calculation was reported in 60.5% (n = 138) of studies. Of these, 52.2% (n = 72) were inconsistent between the framing of their research question, sample size calculation, and conclusion. The majority (n = 137) of sample size calculations were for equality, but 43.8% of these studies concluded superiority, noninferiority, or equivalence. Studies that framed their research question as equality (n = 186) should have been framed as superiority (n = 129), equivalence (n = 52), or noninferiority (n = 3). Only 2 studies correctly framed their research question as equality.
Studies published in high-impact journals were often inconsistent between the framing of their research question, sample size calculation, and conclusion. Authors may be misinterpreting research findings and making clinical recommendations based solely on p values. Researchers are encouraged to state and justify their methodological framework and choice of margin(s) in a publicly available protocol, because these choices affect the required sample size and the applicability of the conclusions.
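To illustrate why the framework and margin matter for sample size, the following minimal sketch uses the standard normal-approximation formulas for comparing two means with equal allocation. It is not taken from the reviewed studies: the function name `n_per_group`, its parameters, and the numeric inputs are hypothetical, and `alpha` is assumed to be two-sided for equality and one-sided for noninferiority and equivalence.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(framework, sd, alpha=0.05, power=0.80, delta=0.0, margin=None):
    """Approximate per-group sample size for comparing two means.

    framework: "equality", "noninferiority", or "equivalence".
    sd:        assumed common standard deviation of the outcome.
    delta:     assumed true difference (new minus control).
    margin:    positive noninferiority/equivalence margin (unused for equality).
    """
    z = NormalDist().inv_cdf  # standard normal quantile function
    beta = 1.0 - power

    if framework == "equality":
        # Two-sided test of "no difference"; needs a nonzero target difference.
        z_sum = z(1 - alpha / 2) + z(1 - beta)
        distance = abs(delta)
    elif framework == "noninferiority":
        # One-sided test that the new treatment is not worse by more than margin.
        z_sum = z(1 - alpha) + z(1 - beta)
        distance = delta + margin
    elif framework == "equivalence":
        # Two one-sided tests that the difference lies within +/- margin.
        z_sum = z(1 - alpha) + z(1 - beta / 2)
        distance = margin - abs(delta)
    else:
        raise ValueError(f"unknown framework: {framework!r}")

    if distance <= 0:
        raise ValueError("assumed difference and margin leave nothing to detect")
    return ceil(2 * (sd * z_sum / distance) ** 2)


# Hypothetical inputs: SD of 10 points and a 5-point difference or margin.
print(n_per_group("equality", sd=10, delta=5))
print(n_per_group("noninferiority", sd=10, margin=5))
print(n_per_group("equivalence", sd=10, margin=5))
```

Under these assumed inputs the three frameworks require 63, 50, and 69 participants per group, respectively, so a sample size calculated for one framework does not automatically support a conclusion drawn under another.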
The results of clinical research must be interpreted using confidence intervals, with careful consideration as to how the confidence intervals relate to clinically meaningful differences in outcomes between treatments. The more typical practice of relying on p values leaves the clinician at high risk of erroneous interpretation, recommendation, and/or action.
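As a hedged illustration of reading a confidence interval against a clinically meaningful difference, the sketch below lists which conclusions an interval for the between-group difference could support. The function name `supported_conclusions`, the convention that positive differences favor the new treatment, and the example margin are all assumptions made for illustration only.

```python
def supported_conclusions(lower, upper, margin):
    """List the conclusions a confidence interval can support.

    lower, upper: confidence limits for (new minus control), where a
                  higher value is assumed to favor the new treatment.
    margin:       positive clinically meaningful difference (assumed).
    """
    if margin <= 0:
        raise ValueError("margin must be positive")
    if lower > upper:
        raise ValueError("lower limit must not exceed upper limit")

    claims = []
    if lower > 0:
        claims.append("superiority")      # interval lies entirely above zero
    if upper < 0:
        claims.append("inferiority")      # interval lies entirely below zero
    if lower > -margin:
        claims.append("noninferiority")   # excludes a meaningful deficit
    if lower > -margin and upper < margin:
        claims.append("equivalence")      # lies entirely within +/- margin
    return claims or ["inconclusive"]


# Hypothetical intervals with an assumed margin of 3 points.
print(supported_conclusions(1.0, 5.0, 3.0))   # ['superiority', 'noninferiority']
print(supported_conclusions(-1.0, 2.0, 3.0))  # ['noninferiority', 'equivalence']
print(supported_conclusions(-5.0, 4.0, 3.0))  # ['inconclusive']
```

The second and third intervals both include zero, so both would give a nonsignificant p value in a test of equality, yet only the second supports a claim of equivalence under the assumed margin.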