Title
Applying Multi- and Cross-Lingual Stochastic Phone Space Transformations to Non-Native Speech Recognition
Abstract
In the context of hybrid HMM/MLP Automatic Speech Recognition (ASR), this paper investigates a new type of stochastic phone space transformation, which maps “source” phone (or phone HMM state) posterior probabilities, as obtained at the output of a Multilayer Perceptron (MLP), into “destination” phone (or phone HMM state) posterior probabilities. The resulting stochastic matrix transformation can be used within the same language to automatically adapt to different phone formats (e.g., IPA) or across languages. Additionally, as shown here, it can be applied successfully to non-native speech recognition. In the same spirit as MLLR adaptation or MLP adaptation, the proposed approach directly maps posterior distributions and is trained by optimizing a Kullback-Leibler based cost function on a small amount of adaptation data, using a modified version of an iterative EM algorithm. On a non-native English database (HIWIRE), comparing multiple setups (monophone and triphone mapping, MLLR adaptation), we show that the resulting posterior mapping yields state-of-the-art results using very limited amounts of adaptation data in mono-, cross- and multi-lingual setups. We also show that “universal” phone posteriors, trained on a large amount of multilingual data, can be transformed to English phone posteriors, resulting in an ASR system that significantly outperforms a system trained on English data only. Finally, we demonstrate that the proposed approach outperforms alternative data-driven mapping techniques as well as a knowledge-based one.
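To make the idea of a stochastic posterior mapping concrete, the following is a minimal sketch, not the authors' implementation. It assumes a row-stochastic matrix W with W[s][d] read as P(destination phone d | source phone s), and it uses a generic EM-style multiplicative update to reduce the KL divergence between target and mapped posteriors. The "modified" EM algorithm mentioned in the abstract is not specified there, so the update rule, the uniform initialization, and all function names below are illustrative assumptions.

```python
import math


def apply_mapping(src, W):
    """Map one source posterior vector (length S) through W (S x D)."""
    n_dst = len(W[0])
    return [sum(src[s] * W[s][d] for s in range(len(src))) for d in range(n_dst)]


def kl_cost(src_frames, dst_frames, W, eps=1e-12):
    """Mean KL(target || mapped) over all adaptation frames."""
    total = 0.0
    for src, dst in zip(src_frames, dst_frames):
        mapped = apply_mapping(src, W)
        total += sum(y * math.log((y + eps) / (z + eps))
                     for y, z in zip(dst, mapped))
    return total / len(src_frames)


def em_update(src_frames, dst_frames, W, eps=1e-12):
    """One EM-style update; each row of W is renormalized to sum to 1.

    Assumption: every source phone receives some posterior mass in the
    adaptation data, so no accumulated row sums to zero.
    """
    n_src, n_dst = len(W), len(W[0])
    acc = [[0.0] * n_dst for _ in range(n_src)]
    for src, dst in zip(src_frames, dst_frames):
        mapped = apply_mapping(src, W)
        for s in range(n_src):
            for d in range(n_dst):
                # Target mass on d, shared among source phones in
                # proportion to their contribution to mapped[d].
                acc[s][d] += dst[d] * src[s] * W[s][d] / (mapped[d] + eps)
    return [[v / sum(row) for v in row] for row in acc]


def train_mapping(src_frames, dst_frames, n_iter=30):
    """Estimate W from paired source/destination posterior frames."""
    n_src, n_dst = len(src_frames[0]), len(dst_frames[0])
    W = [[1.0 / n_dst] * n_dst for _ in range(n_src)]  # uniform start
    costs = []
    for _ in range(n_iter):
        costs.append(kl_cost(src_frames, dst_frames, W))
        W = em_update(src_frames, dst_frames, W)
    costs.append(kl_cost(src_frames, dst_frames, W))
    return W, costs
```

Calling `train_mapping(src_frames, dst_frames)` on paired posterior frames returns the estimated matrix and the per-iteration cost. Because every row of W sums to one, `apply_mapping` always returns a valid probability distribution over destination phones.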
Year: 2013
DOI: 10.1109/TASL.2013.2260150
Venue: IEEE Transactions on Audio, Speech, and Language Processing
Keywords: statistical distributions, probability, iterative methods, hidden Markov models, speech recognition
Field: Triphone, Pattern recognition, Iterative method, Computer science, Expectation–maximization algorithm, Speech recognition, Posterior probability, Phone, Probability distribution, Multilayer perceptron, Artificial intelligence, Hidden Markov model
DocType: Journal
Volume: 21
Issue: 8
ISSN: 1558-7916
Citations: 3
PageRank: 0.40
References: 10
Authors: 5
Name                  Order  Citations  PageRank
David Imseng          1      104        8.42
Hervé Bourlard        2      1932       227.84
John Dines            3      36         3.39
Philip N. Garner      4      304        41.04
Mathew Magimai-Doss   5      516        54.76