Data-driven methods distort optimal cutoffs and accuracy estimates of depression screening tools: a simulation study using individual participant data.
Journal Articles

abstract

  • OBJECTIVE: To evaluate, across multiple sample sizes, the degree to which data-driven methods result in (1) optimal cutoffs that differ from the population optimal cutoff and (2) bias in accuracy estimates.
  • STUDY DESIGN AND SETTING: For each sample size (100, 200, 500, and 1,000), 1,000 samples were randomly drawn from a database (n = 13,255) synthesized to assess Edinburgh Postnatal Depression Scale (EPDS) screening accuracy, simulating studies of different sizes. Optimal cutoffs were selected by maximizing Youden's J (sensitivity + specificity - 1). Optimal cutoffs and accuracy estimates in simulated samples were compared to population values.
  • RESULTS: Optimal cutoffs in simulated samples ranged from ≥ 5 to ≥ 17 for n = 100, ≥ 6 to ≥ 16 for n = 200, ≥ 6 to ≥ 14 for n = 500, and ≥ 8 to ≥ 13 for n = 1,000. The percentage of simulated samples identifying the population optimal cutoff (≥ 11) was 30% for n = 100, 35% for n = 200, 53% for n = 500, and 71% for n = 1,000. Mean overestimation of sensitivity and underestimation of specificity were 6.5 percentage points (pp) and -1.3 pp for n = 100, 4.2 pp and -1.1 pp for n = 200, 1.8 pp and -1.0 pp for n = 500, and 1.4 pp and -1.0 pp for n = 1,000.
  • CONCLUSIONS: Small accuracy studies that use data-driven methods may identify an inaccurate optimal cutoff and overstate accuracy estimates.
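
The resampling design described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code and does not use the EPDS database. The population is synthetic, and the prevalence, score distributions, sampling without replacement, and all function names (`sens_spec`, `youden_optimal_cutoff`, `make_population`, `simulate`) are assumptions made for illustration. The abstract does not spell out how bias was computed, so the sketch assumes one possibility: the sample estimate at the sample-selected cutoff minus the population value at the population optimal cutoff.

```python
import random


def sens_spec(scores, labels, cutoff):
    """Sensitivity and specificity when score >= cutoff is a positive screen."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return tp / n_pos, tn / n_neg


def youden_optimal_cutoff(scores, labels, cutoffs):
    """Cutoff maximizing J = sensitivity + specificity - 1 (ties: first cutoff)."""
    best_j, best_cut = None, None
    for c in cutoffs:
        sens, spec = sens_spec(scores, labels, c)
        j = sens + spec - 1
        if best_j is None or j > best_j:
            best_j, best_cut = j, c
    return best_cut


def make_population(n, prevalence, rng):
    """Hypothetical synthetic population on a 0-30 scale; NOT the EPDS data."""
    scores, labels = [], []
    for _ in range(n):
        y = 1 if rng.random() < prevalence else 0
        mean = 14 if y else 7  # assumed mean scores for cases and non-cases
        scores.append(min(30, max(0, round(rng.gauss(mean, 4)))))
        labels.append(y)
    return scores, labels


def simulate(scores, labels, sample_size, n_samples, rng, cutoffs=range(0, 31)):
    """Draw repeated samples and compare each one to the population values."""
    pop_cut = youden_optimal_cutoff(scores, labels, cutoffs)
    pop_sens, pop_spec = sens_spec(scores, labels, pop_cut)
    hits, sens_diff, spec_diff, done = 0, 0.0, 0.0, 0
    while done < n_samples:
        # Assumption: sampling without replacement from the population.
        idx = rng.sample(range(len(scores)), sample_size)
        s = [scores[i] for i in idx]
        y = [labels[i] for i in idx]
        if sum(y) in (0, len(y)):
            continue  # redraw if the sample lacks cases or non-cases
        cut = youden_optimal_cutoff(s, y, cutoffs)
        sens, spec = sens_spec(s, y, cut)
        hits += cut == pop_cut
        sens_diff += sens - pop_sens  # assumed bias definition (see lead-in)
        spec_diff += spec - pop_spec
        done += 1
    return {
        "population_cutoff": pop_cut,
        "hit_rate": hits / n_samples,
        "mean_sens_diff": sens_diff / n_samples,
        "mean_spec_diff": spec_diff / n_samples,
    }


if __name__ == "__main__":
    rng = random.Random(0)
    pop_scores, pop_labels = make_population(13255, 0.12, rng)
    for n in (100, 200, 500, 1000):
        print(n, simulate(pop_scores, pop_labels, n, 200, rng))
```

Because each sample's cutoff is chosen to maximize Youden's J in that same sample, the accuracy estimates at the selected cutoff tend to be optimistic, and the selected cutoff tends to vary more when samples are small. The numbers this sketch prints depend entirely on the assumed synthetic population and should not be compared with the results reported in the abstract.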

authors

  • Rice, Danielle
  • Bhandari, Parash Mani
  • Levis, Brooke
  • Neupane, Dipika
  • Patten, Scott B
  • Shrier, Ian
  • Thombs, Brett D
  • Benedetti, Andrea
  • Depression Screening Data (DEPRESSD) EPDS Group

publication date

  • September 2021