Journal article

Regularized Speaker Adaptation of KL-HMM for Dysarthric Speech Recognition

Abstract

This paper addresses the problem of recognizing speech uttered by patients with dysarthria, a motor speech disorder that impedes the physical production of speech. Patients with dysarthria have articulatory limitations and therefore often have trouble pronouncing certain sounds, resulting in undesirable phonetic variation. Modern automatic speech recognition systems designed for regular speakers are ineffective for speakers with dysarthria because of this phonetic variation. To capture the phonetic variation, a Kullback-Leibler divergence-based hidden Markov model (KL-HMM) is adopted, in which the emission probability of each state is parameterized by a categorical distribution using phoneme posterior probabilities obtained from a deep neural network-based acoustic model. To further reflect speaker-specific phonetic variation patterns, a speaker adaptation method is proposed that combines L2 regularization with confusion-reducing regularization, which can enhance discriminability between the categorical distributions of the KL-HMM states while preserving speaker-specific information. Evaluation of the proposed speaker adaptation method on a database of several hundred words for 30 speakers, consisting of 12 mildly dysarthric, 8 moderately dysarthric, and 10 non-dysarthric control speakers, showed that the proposed approach significantly outperformed the conventional deep neural network-based speaker-adapted system on dysarthric as well as non-dysarthric speech.
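
As a rough illustration of the KL-HMM idea described in the abstract, the sketch below scores one frame of phoneme posteriors against per-state categorical distributions using the Kullback-Leibler divergence. It is not the authors' implementation: the function names `kl_divergence` and `best_state`, the toy three-phoneme distributions, and the choice of divergence direction (state distribution relative to the posterior) are all assumptions made for illustration, and the regularized adaptation step is not shown.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Return KL(p || q) for two categorical distributions of equal length.

    eps is a small smoothing constant (an assumption of this sketch) that
    keeps the logarithm finite when an entry of q is zero.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )


def best_state(posterior, state_distributions):
    """Score one frame against every state and return (best index, scores).

    posterior: phoneme posterior vector for the frame, e.g. from a DNN.
    state_distributions: one categorical distribution per KL-HMM state.
    A lower divergence means the state explains the frame better.
    """
    scores = [kl_divergence(y, posterior) for y in state_distributions]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Hypothetical toy setup: 3 phoneme classes, 2 states.
states = [
    [0.8, 0.1, 0.1],  # state 0 expects mostly phoneme 0
    [0.1, 0.8, 0.1],  # state 1 expects mostly phoneme 1
]
frame_posterior = [0.7, 0.2, 0.1]
index, scores = best_state(frame_posterior, states)
```

In a full recognizer these per-frame divergences would serve as local scores inside HMM decoding; here only the frame-level comparison is sketched.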

Authors

Kim M; Kim Y; Yoo J; Wang J; Kim H

Journal

IEEE Transactions on Neural Systems and Rehabilitation Engineering, Vol. 25, No. 9, pp. 1581–1591

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Publication Date

September 1, 2017

DOI

10.1109/tnsre.2017.2681691

ISSN

1534-4320