Evaluating sport-for-development outcome measures used in a living lab setting: Process, improvements, and insights.
Journal Articles
Abstract
BACKGROUND: Sport-for-development (SFD) is an innovative approach that uses sport to foster positive physical, mental, and social outcomes among children and youth, particularly those from underserved backgrounds. Living labs, which emphasize participant-centered research conducted in natural, real-world environments, present unique challenges for outcome measurement, including reduced control over conditions, variability in participant engagement, and logistical issues that complicate standardized data collection. Moreover, few outcome measures have been developed specifically for SFD measurement in living lab settings. For these reasons, outcome measurement in a living lab setting remains challenging. OBJECTIVE: Our objective was to evaluate a set of outcome measures administered in a living lab setting to better understand their performance, reliability, and areas for improvement. METHODS: SFD programming was delivered in a living lab setting at a large facility in an urban center in Toronto, Canada. We evaluated 11 self-reported, Likert-style outcome measures against 8 key metrics from Classical Test Theory, examining, for example, floor and ceiling effects, inter-item correlations, internal consistency, and test-retest reliability. Data were collected from 2019 to 2024 across multiple cohorts aged 6-29 years, involving diverse SFD programs. RESULTS: Our analysis of 2656 questionnaire completions demonstrated strengths in data collection, including high completion rates with minimal missing data (91% of outcome measures met missingness thresholds), yet also highlighted issues primarily related to single-item endorsement and inter-item correlations (with 38% and 19% of outcome measures meeting these thresholds, respectively).
These insights prompted iterative improvements to the evaluation tools, such as modifying Likert scale response formats to include more response categories (thereby reducing the impact of response binning). CONCLUSIONS: Evaluating our outcome measures provided insight into how they can be improved for administration in a living lab setting. The results emphasize the need for context-appropriate tools to effectively capture nuanced SFD program impacts and underscore the importance of ongoing validation to improve both research quality and practical implementation in living lab environments.
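The Classical Test Theory checks named in the methods (internal consistency, floor and ceiling effects) can be illustrated with a short sketch. This is a minimal, generic implementation of two standard CTT statistics, not the authors' actual analysis code; the function names, thresholds, and data layout are illustrative assumptions.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Internal consistency (Cronbach's alpha) for a matrix of
    Likert scores shaped (n_respondents, n_items)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)          # per-item variance
    total_var = items.sum(axis=1).var(ddof=1)      # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def floor_ceiling_rates(scores, scale_min, scale_max):
    """Proportion of respondents at the lowest and highest scale points;
    rates above ~15% are often flagged as floor/ceiling effects."""
    scores = np.asarray(scores)
    floor = np.mean(scores == scale_min)
    ceiling = np.mean(scores == scale_max)
    return floor, ceiling

# Example: four respondents answering three perfectly aligned items
items = np.array([[1, 2, 2], [2, 3, 3], [3, 4, 4], [4, 5, 5]], dtype=float)
alpha = cronbach_alpha(items)

# Example: total scores on a 1-5 scale, checked for floor/ceiling clustering
floor, ceiling = floor_ceiling_rates([1, 1, 3, 5, 5, 5], 1, 5)
```

In practice such statistics would be computed per outcome measure and per cohort, then compared against pre-specified thresholds like those the abstract reports (e.g., missingness, inter-item correlation ranges).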