Title
End-To-End Dysarthric Speech Recognition Using Multiple Databases
Abstract
This paper presents an end-to-end automatic speech recognition (ASR) system for a person with an articulation disorder resulting from athetoid cerebral palsy. The speech of a person with this type of disorder differs considerably from that of a physically unimpaired person, and little of it is available for model training because speaking places a heavy strain on the speech muscles. As a result, the performance of ASR systems for people with an articulation disorder degrades significantly. We propose an end-to-end ASR framework that is trained not only on the speech of a Japanese person with an articulation disorder but also on the speech of a physically unimpaired Japanese person and of a non-Japanese person with an articulation disorder, in order to compensate for the lack of training data for the target speaker. An end-to-end ASR model encapsulates an acoustic model and a language model jointly. In the proposed model, the acoustic-model portion is shared between persons with dysarthria, and a language-model portion is assigned to each language regardless of dysarthria. Experimental results show the merit of using multiple databases for speech recognition.
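The sharing scheme described in the abstract can be pictured as a routing table from each database to the model portions it trains. The following is a minimal sketch, not the authors' implementation: the database labels, the `non_ja` language tag, and the component names are hypothetical, and giving the unimpaired speaker a separate acoustic portion is an assumption (the abstract states only that the acoustic portion is shared between persons with dysarthria).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Database:
    name: str         # hypothetical label, not taken from the paper
    language: str     # placeholder tags: "ja" or "non_ja"
    dysarthric: bool


def component_keys(db):
    """Return (acoustic portion, language portion) trained by this database."""
    # Assumption: unimpaired speech gets its own acoustic portion.
    acoustic = "acoustic_dysarthric" if db.dysarthric else "acoustic_unimpaired"
    # One language portion per language, regardless of dysarthria.
    language = "language_" + db.language
    return acoustic, language


DATABASES = [
    Database("ja_dysarthric", "ja", True),          # target speaker
    Database("ja_unimpaired", "ja", False),
    Database("non_ja_dysarthric", "non_ja", True),
]


def sharing_partners(target, databases):
    """For each portion the target uses, list the other databases that also train it."""
    partners = {}
    for key in component_keys(target):
        partners[key] = [
            db.name
            for db in databases
            if db != target and key in component_keys(db)
        ]
    return partners


if __name__ == "__main__":
    for db in DATABASES:
        print(db.name, "->", component_keys(db))
    print(sharing_partners(DATABASES[0], DATABASES))
```

Under these assumptions, the target speaker's acoustic portion also receives data from the non-Japanese dysarthric database, and its language portion also receives data from the unimpaired Japanese database, which is how the extra databases are meant to offset the shortage of target-speaker data.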
Year
2019
DOI
10.1109/icassp.2019.8683803
Venue
2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Keywords
Speech recognition, multilingual, assistive technology, end-to-end model, dysarthria
Field
Training set, Athetoid cerebral palsy, Dysarthric speech, Computer science, End-to-end principle, Dysarthria, Language model, Database, Acoustic model
DocType
Conference
ISSN
1520-6149
Citations
0
PageRank
0.34
References
0
Authors
3
Name               Order  Citations  PageRank
Yuki Takashima     1      4          3.84
Tetsuya Takiguchi  2      85         8.77
Yasuo Ariki        3      519        88.94