A Machine-Learning Algorithm for the Automated Perceptual Evaluation of Dysphonia Severity
Abstract
OBJECTIVES: Auditory-perceptual assessment is the gold standard for evaluating voice quality. This project aims to develop a machine-learning model that measures the perceptual dysphonia severity of audio samples consistently with assessments by expert raters.

METHODS: Samples from the Perceptual Voice Qualities Database were used, including sustained vowels and Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V) sentences, previously rated by experts on a 0-100 scale. The openSMILE (audEERING GmbH, Gilching, Germany) toolkit was used to extract acoustic (Mel-frequency cepstral coefficient-based, n = 1428) and prosodic (n = 152) features, along with pitch onsets and recording duration. A support vector machine trained on these features (n = 1582) performed the automated assessment of dysphonia severity. Recordings were separated into vowels (V) and sentences (S), and features were extracted separately from each. Final voice-quality predictions were made by combining the features extracted from the individual components with those from the whole audio (WA) sample (three file sets: S, V, WA).

RESULTS: The algorithm's predictions correlated strongly (r = 0.847) with the estimates of expert raters, with a root mean square error of 13.36. Increasing signal complexity improved the estimation of dysphonia severity: the combined feature set outperformed the WA, S, and V sets individually.

CONCLUSION: A novel machine-learning algorithm produced perceptual estimates of dysphonia severity from standardized audio samples on a 100-point scale, and these estimates were highly correlated with expert ratings. This suggests that machine-learning algorithms could offer an objective method for evaluating voice samples for dysphonia severity.
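To make the described pipeline concrete, the following is a minimal Python sketch of openSMILE feature extraction followed by support-vector regression of severity scores. It is an illustration under stated assumptions, not the authors' implementation: the opensmile Python wrapper's ComParE_2016 functionals set stands in for the paper's 1,582 hand-selected MFCC-based and prosodic features, the SVR hyperparameters are defaults chosen for the example, and the helper names (extract_features, combined_features, train_and_evaluate) are hypothetical.

```python
# Sketch of the abstract's pipeline: openSMILE features -> SVR on a 0-100
# severity scale, evaluated with Pearson correlation and RMSE.
# Assumptions (not from the paper): ComParE_2016 functionals as a stand-in
# feature set; illustrative SVR hyperparameters; data loading omitted.
import numpy as np
import opensmile
from scipy.stats import pearsonr
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.ComParE_2016,
    feature_level=opensmile.FeatureLevel.Functionals,
)

def extract_features(wav_path: str) -> np.ndarray:
    """Return one functionals vector (a single row) for a recording."""
    return smile.process_file(wav_path).to_numpy().ravel()

def combined_features(sentence_wav: str, vowel_wav: str, whole_wav: str) -> np.ndarray:
    """Concatenate the S, V, and WA vectors, mirroring the paper's
    combination of the three file sets for the final prediction."""
    return np.concatenate([
        extract_features(sentence_wav),
        extract_features(vowel_wav),
        extract_features(whole_wav),
    ])

def train_and_evaluate(X: np.ndarray, y: np.ndarray) -> None:
    """X: one combined feature row per speaker; y: expert ratings (0-100)."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0))
    model.fit(X_tr, y_tr)
    pred = np.clip(model.predict(X_te), 0, 100)  # keep scores on the 0-100 scale
    r, _ = pearsonr(y_te, pred)
    rmse = np.sqrt(mean_squared_error(y_te, pred))
    print(f"Pearson r = {r:.3f}, RMSE = {rmse:.2f}")
```

Concatenating the per-component vectors rather than pooling one whole-file vector reflects the abstract's finding that the combined S + V + WA feature set outperformed any single file set.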