Over-reliance on P Values in Urology: Fragility of Findings in the Hydronephrosis Literature Calls for Systematic Reporting of Robustness Indicators
OBJECTIVE: To assess the robustness of the hydronephrosis literature by calculating the fragility index (FI) and fragility quotient (FQ) of published findings.

METHODS: A literature review was conducted using PubMed, Medline, and Ovid for "hydronephrosis" and associated terms, and all studies comparing at least 2 groups were included. FI was calculated by entering study results into a 2-by-2 contingency table and generating a P value using Fisher's exact test. Events were then added one at a time to the group with the fewest events, each time removing a nonevent from the same group so that the group size stayed constant, and Fisher's exact test was repeated until the P value was >.05. FQ was calculated by dividing FI by the total sample size.

RESULTS: The 130 included articles were published between 1986 and 2018 in 32 journals. Median citation count was 14 (0-252), 30% were RCTs, and the most common countries of origin were the United States (28%), Turkey (10%), and Canada (9%). Median FI was 2 (1-112), FQ was 0.023 (0.0010-0.55), and 60 papers (46%) had an FI of 1, indicating extremely fragile results. FI differed significantly between observational studies and RCTs (10 ± 17 vs 4 ± 5; P = .02), whereas FQ did not (0.032 ± 0.030 vs 0.053 ± 0.080; P = .09).

CONCLUSION: Nearly half of the studies in the hydronephrosis literature that report significant results are extremely fragile, requiring the addition of only a couple of events in 1 treatment arm to change the statistical significance of the results. As such, objective reporting of the robustness of results should include FI and FQ, which may help diminish over-reliance on P values as the main indicator of clinical significance in comparative studies.