Reliable assessment of operative performance

BACKGROUND: There is no consensus regarding the number of intraoperative assessments required to reliably measure trainee performance. This study used generalizability theory (GT) to describe the factors contributing to score variance and to estimate the number of assessments needed to achieve high standards of reliability.

METHODS: While performing laparoscopic procedures, trainees were assessed by the attending surgeon using the Global Operative Assessment of Laparoscopic Skills (GOALS). Data were collected prospectively (2-month intervals), and each trainee was assessed multiple times. The reliability coefficient was calculated using trainees, cases, and raters as factors.

RESULTS: Eighteen trainees were included, for a total of 65 assessments. Total variance in scores was accounted for as follows: 66.1% by trainees, 31.6% by the interaction between trainees and cases, and 2.3% by raters. At least 3 cases are required for reliable scores using GOALS.

CONCLUSIONS: Trainees accounted for most of the variance in GOALS scores, and a minimum of 3 cases is required to obtain reliable scores. These data may guide the implementation of performance assessments in surgical training programs.
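The link between the reported variance components and the number of cases can be illustrated with a short projection, in the spirit of a GT decision study. This is a minimal sketch and not necessarily the authors' calculation: it assumes that each case is scored by one rater, that the trainee-by-case and rater components both count as error and shrink in proportion to the number of cases averaged, and that 0.80 is the reliability target. The abstract states neither the formula nor the threshold, and the function names are hypothetical.

```python
# Variance components (percent of total variance) as reported in the abstract.
VAR_TRAINEE = 66.1         # true differences between trainees
VAR_TRAINEE_X_CASE = 31.6  # trainee-by-case interaction
VAR_RATER = 2.3            # raters


def projected_reliability(n_cases: int) -> float:
    """Projected reliability of a trainee's mean score over n_cases.

    Assumption (not stated in the abstract): all non-trainee variance is
    error, and averaging over n_cases divides that error by n_cases.
    """
    if n_cases < 1:
        raise ValueError("n_cases must be at least 1")
    error = (VAR_TRAINEE_X_CASE + VAR_RATER) / n_cases
    return VAR_TRAINEE / (VAR_TRAINEE + error)


def min_cases(threshold: float, max_cases: int = 50) -> int:
    """Smallest number of cases whose projected reliability meets threshold."""
    for n in range(1, max_cases + 1):
        if projected_reliability(n) >= threshold:
            return n
    raise ValueError("threshold not reached within max_cases")


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, round(projected_reliability(n), 3))
    # The 0.80 target is an assumed threshold, not a value from the abstract.
    print("cases needed:", min_cases(0.80))
```

Under these assumptions the projected coefficient first reaches 0.80 at three cases, which is consistent with the minimum reported in the results. A different error definition or target would shift that number.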
