Title | ||
---|---|---|
Speaker-Independent Silent Speech Recognition With Across-Speaker Articulatory Normalization And Speaker Adaptive Training |
Abstract | ||
---|---|---|
Silent speech recognition (SSR) converts non-audio information (e.g., articulatory information) to speech. SSR has the potential to enable laryngectomees to produce synthesized speech with a natural-sounding voice. Despite its recent advances, current SSR research has largely relied on speaker-dependent recognition. The high degree of variation in articulatory patterns across talkers has been a barrier to developing effective speaker-independent SSR approaches. Speaker-independent approaches, however, are critical for reducing the large amount of training data required from each user; often only limited articulatory samples are available from an individual because articulatory data collection is logistically difficult. In this paper, we investigated speaker-independent silent speech recognition from tongue and lip movement data with two approaches that address the across-talker variation: Procrustes matching, a physiological approach that minimizes the across-talker physiological differences of the articulators, and speaker adaptive training, a data-driven approach. A silent speech data set was collected using an electromagnetic articulograph (EMA) from five English speakers (while they were silently articulating phrases) and was used to evaluate the two speaker-independent SSR approaches. The long-standing Gaussian mixture model-hidden Markov model and the more recent deep neural network-hidden Markov model were used as the recognizers. Experimental results showed the effectiveness of both normalization approaches. |
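The Procrustes matching named in the abstract can be illustrated with a generic ordinary Procrustes alignment. This is a minimal sketch, not the authors' implementation: it assumes each articulatory shape is an array of sampled sensor positions (one row per point), and the function name `procrustes_align` and the toy data are hypothetical. It removes translation, scale, and rotation differences between a shape and a reference.

```python
import numpy as np


def procrustes_align(shape, reference):
    """Align `shape` to `reference`; both are (n_points, n_dims) arrays.

    Returns the aligned shape and the normalised reference.
    """
    # Remove translation: move each centroid to the origin.
    x = shape - shape.mean(axis=0)
    y = reference - reference.mean(axis=0)

    # Remove scale: give each shape unit Frobenius norm.
    x = x / np.linalg.norm(x)
    y = y / np.linalg.norm(y)

    # Remove rotation: find the orthogonal matrix r minimising ||x @ r - y||
    # from the SVD of the cross-covariance matrix.
    u, _, vt = np.linalg.svd(x.T @ y)
    r = u @ vt

    # Disallow reflections so the result is a pure rotation.
    if np.linalg.det(r) < 0:
        u[:, -1] *= -1
        r = u @ vt

    return x @ r, y


if __name__ == "__main__":
    # Toy data (hypothetical): a reference shape and a rotated, scaled,
    # shifted copy standing in for a second talker.
    rng = np.random.default_rng(0)
    reference = rng.normal(size=(6, 2))
    theta = np.deg2rad(30.0)
    rotation = np.array([[np.cos(theta), -np.sin(theta)],
                         [np.sin(theta), np.cos(theta)]])
    other = 2.5 * (reference @ rotation) + np.array([3.0, -1.0])

    aligned, target = procrustes_align(other, reference)
    print("residual after alignment:", np.linalg.norm(aligned - target))
```

In this toy case the second shape differs from the reference only by a similarity transform, so the residual should be close to zero; real across-talker data would leave a non-zero residual.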
Year | Venue | Keywords |
---|---|---|
2015 | 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 | silent speech recognition, Procrustes matching, speaker adaptive training, hidden Markov models, deep neural network |
Field | DocType | Citations |
---|---|---|
Normalization (statistics), Pattern recognition, Computer science, Speech recognition, Speaker recognition, Artificial intelligence, Speaker diarisation, Hidden Markov model | Conference | 6 |
PageRank | References | Authors |
---|---|---|
0.47 | 9 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jun Wang | 1 | 144 | 15.26 |
Seongjun Hahm | 2 | 73 | 8.20 |