Evaluating the reliability of gestalt quality ratings of medical education podcasts: A METRIQ study

Abstract

Introduction: Podcasts are increasingly being used for medical education. Studies have found that the assessment of the quality of online resources can be challenging. We sought to determine the reliability of gestalt quality assessment of education podcasts in emergency medicine.

Methods: An international, interprofessional sample of raters was recruited through social media, direct contact, and the extended personal network of the study team. Each participant listened to eight podcasts (selected to include a variety of accents, number of speakers, and topics) and rated the quality of that podcast on a seven-point Likert scale. Phi coefficients were calculated within each group and overall. Decision studies were conducted using a phi of 0.8.

Results: A total of 240 collaborators completed all eight surveys and were included in the analysis. Attendings, medical students, and physician assistants had the lowest individual-level variance and thus the lowest number of required raters to reliably evaluate quality (phi >0.80). Overall, 20 raters were required to reliably evaluate the quality of emergency medicine podcasts.

Discussion: Gestalt ratings of quality from approximately 20 health professionals are required to reliably assess the quality of a podcast. This finding should inform future work focused on developing and validating tools to support the evaluation of quality in these resources.

Authors

Woods JM; Chan TM; Roland D; Riddell J; Tagg A; Thoma B

Journal

Perspectives on Medical Education, Vol. 9, No. 5, pp. 302–306

Publisher

Ubiquity Press

Publication Date

October 1, 2020

DOI

10.1007/s40037-020-00589-x

ISSN

1389-6555

Associated Experts

Teresa Chan

Clinical Professor, Faculty of Health Sciences
