Title
Automatic Intelligibility Assessment of Dysarthric Speech Using Phonologically-Structured Sparse Linear Model
Abstract
This paper presents a new method for automatically assessing the speech intelligibility of patients with dysarthria, a motor speech disorder that impedes the physical production of speech. The proposed method consists of two main steps: feature representation and prediction. In the feature representation step, the speech utterance is converted into a phone sequence using automatic speech recognition and is then aligned with a canonical phone sequence from a pronunciation dictionary using a weighted finite state transducer (WFST) to capture pronunciation mappings such as match, substitution, and deletion. Histograms of these pronunciation mappings over a pre-defined word set serve as features. In the prediction step, a structured sparse linear model that incorporates phonological knowledge is proposed; it simultaneously performs phonologically structured sparse feature selection and intelligibility prediction. Evaluated on a database of 109 speakers (94 dysarthric and 15 control), the proposed method yielded a root mean square error of 8.14 against subjectively rated scores in the range of 0 to 100. This performance is promising and suggests that the system could help speech therapists in diagnosing the degree of speech disorder.
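The feature representation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a plain minimum-edit-distance alignment stands in for the weighted finite state transducer, unit costs are assumed, and the function names `align_phones` and `mapping_histogram` as well as the phone labels are hypothetical. Insertions are tracked too, although the abstract names only match, substitution, and deletion.

```python
from collections import Counter

def align_phones(canonical, recognized):
    """Align two phone sequences by minimum edit distance (unit costs assumed).

    Returns a list of (canonical_phone, recognized_phone) pairs, where None
    marks a gap: (c, None) is a deletion, (None, r) is an insertion.
    """
    n, m = len(canonical), len(recognized)
    # cost[i][j] = edit distance between canonical[:i] and recognized[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (canonical[i - 1] != recognized[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and cost[i][j] ==
                cost[i - 1][j - 1] + (canonical[i - 1] != recognized[j - 1])):
            pairs.append((canonical[i - 1], recognized[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            pairs.append((canonical[i - 1], None))  # canonical phone dropped
            i -= 1
        else:
            pairs.append((None, recognized[j - 1]))  # extra recognized phone
            j -= 1
    pairs.reverse()
    return pairs

def mapping_histogram(pairs):
    """Count pronunciation mappings by type over aligned phone pairs."""
    hist = Counter()
    for canon, recog in pairs:
        if canon is None:
            hist["insertion"] += 1
        elif recog is None:
            hist["deletion"] += 1
        elif canon == recog:
            hist["match"] += 1
        else:
            hist["substitution"] += 1
    return hist

# Hypothetical example: the canonical phones of one word versus a recognized
# phone sequence that is missing the vowel.
example_pairs = align_phones(["k", "ae", "t"], ["k", "t"])
example_hist = mapping_histogram(example_pairs)
```

In the paper's setting such counts would be accumulated over a pre-defined word set and passed to the sparse linear model; a weighted transducer would additionally allow non-uniform costs per phone pair, which this sketch does not model.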
Year
2015
DOI
10.1109/TASLP.2015.2403619
Venue
IEEE/ACM Transactions on Audio, Speech & Language Processing
Keywords
finite state machines,pronunciation confusion network,medical disorders,speech intelligibility assessment,speech recognition,pronunciation dictionary,speech intelligibility,sparse matrices,weighted finite state transducer,speech utterance,automatic speech recognition technique,weighted finite state transducer (wfst),physical speech production,motor speech disorder,intelligibility prediction,canonical phone sequence,dysarthria,automatic dysarthric speech intelligibility assessment,pronunciation mapping histograms,phonologically-structured sparse linear model,structured sparse model,speech,phonologically structured sparse feature selection,transducers,predictive models,speech processing
Field
Pronunciation,Speech processing,Feature selection,Computer science,Utterance,Speech recognition,Natural language processing,Artificial intelligence,Speech disorder,Motor speech disorders,Dysarthria,Intelligibility (communication)
DocType
Journal
Volume
23
Issue
4
ISSN
2329-9290
Citations
5
PageRank
0.48
References
23
Authors
3
Name, Order, Citations, PageRank
Myung Jong Kim, 1, 31, 6.30
Younggwan Kim, 2, 17, 6.11
Hoi-Rin Kim, 3, 102, 20.64