Speaker-Independent Silent Speech Recognition With Across-Speaker Articulatory Normalization And Speaker Adaptive Training - Citegraph

Paper Info

Title
Speaker-Independent Silent Speech Recognition With Across-Speaker Articulatory Normalization And Speaker Adaptive Training

Abstract
Silent speech recognition (SSR) converts non-audio information (e.g., articulatory information) to speech. SSR has the potential to enable laryngectomees to produce synthesized speech with a natural-sounding voice. Despite recent advances, current SSR research has largely relied on speaker-dependent recognition. A high degree of variation in articulatory patterns across different talkers has been a barrier to developing effective speaker-independent SSR approaches. Speaker-independent approaches, however, are critical for reducing the large amount of training data required from each user; only limited articulatory samples are often available for individuals, due to the logistical difficulty of articulatory data collection. In this paper, we investigated speaker-independent silent speech recognition from tongue and lip movement data with two models that address the across-talker variation: Procrustes matching, a physiological approach, to minimize the across-talker physiological differences of articulators, and speaker adaptive training, a data-driven approach. A silent speech data set was collected using an electromagnetic articulograph (EMA) from five English speakers (while they were silently articulating phrases) and was used to evaluate the two speaker-independent SSR approaches. The long-standing Gaussian mixture model-hidden Markov model and the more recently available deep neural network-hidden Markov model were used as the recognizers. Experimental results showed the effectiveness of both normalization approaches.
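
The abstract does not spell out how Procrustes matching is applied, so the following is only a minimal sketch of generic Procrustes alignment, not the authors' procedure. It assumes each articulatory sample is an N x 2 array of sensor positions and that a reference shape is available; the function name `procrustes_normalize` and both arguments are hypothetical. The sketch removes translation, scale, and rotation differences between a speaker's shape and the reference.

```python
import numpy as np


def procrustes_normalize(shape, reference):
    """Align `shape` to `reference` (both N x 2 arrays of points).

    Hypothetical helper: a generic Procrustes alignment, not the
    exact normalization used in the paper.
    """
    shape = np.asarray(shape, dtype=float)
    reference = np.asarray(reference, dtype=float)

    # Remove translation: move both centroids to the origin.
    shape_c = shape - shape.mean(axis=0)
    ref_c = reference - reference.mean(axis=0)

    # Remove scale: rescale both to unit Frobenius norm.
    shape_c = shape_c / np.linalg.norm(shape_c)
    ref_c = ref_c / np.linalg.norm(ref_c)

    # Remove rotation: solve the orthogonal Procrustes problem via SVD.
    u, _, vt = np.linalg.svd(shape_c.T @ ref_c)
    rotation = u @ vt

    # Keep a proper rotation (no mirror image of the articulators).
    if np.linalg.det(rotation) < 0:
        u[:, -1] *= -1
        rotation = u @ vt

    return shape_c @ rotation
```

Applied to every speaker against a common reference, a step like this would place all shapes in one centered, unit-scaled coordinate frame before they are passed to a recognizer.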

Year	Venue	Keywords
2015	16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5	silent speech recognition, Procrustes matching, speaker adaptive training, hidden Markov models, deep neural network
Field	DocType	Citations
Normalization (statistics),Pattern recognition,Computer science,Speech recognition,Speaker recognition,Artificial intelligence,Speaker diarisation,Hidden Markov model	Conference	6
PageRank	References	Authors
0.47	9	2

Authors (2 rows)

Name	Order	Citations	PageRank
Jun Wang	1	144	15.26
Seongjun Hahm	2	73	8.20
