Title
Research on Transfer Learning for Khalkha Mongolian Speech Recognition Based on TDNN
Abstract
Automatic speech recognition (ASR) systems that combine neural networks with hidden Markov models (NN/HMM) have achieved state-of-the-art results on various benchmarks, but most rely on large amounts of training data. ASR therefore remains difficult for low-resource languages such as Khalkha Mongolian. Transfer learning has been shown to be effective at exploiting out-of-domain data to improve ASR performance in similarly data-scarce settings. In this paper, we investigate two weight-transfer approaches to improve Khalkha Mongolian ASR based on lattice-free maximum mutual information (LF-MMI) training. Moreover, i-vector features are spliced with MFCC features as input to validate the effectiveness of the transferred Khalkha Mongolian ASR models. Experimental results show that weight transfer from out-of-domain Chahar speech achieves substantial improvements over the baseline model on Khalkha speech, and that transferring part of the model performs better than transferring the whole model. Furthermore, splicing i-vectors with MFCCs as input features further enhances acoustic model performance. The word error rate (WER) of the best model is relatively reduced by 10.96% compared with the in-domain Khalkha speech baseline model.
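The partial weight transfer the abstract describes can be illustrated with a minimal sketch: hidden layers trained on the out-of-domain (Chahar) model are copied into a fresh target (Khalkha) model, while the remaining layers, including the output layer, are re-initialised and trained on the target data. The layer names, dimensions, and dict-of-arrays "checkpoint" format below are illustrative assumptions, not the paper's actual Kaldi/TDNN setup.

```python
# Sketch of partial weight transfer between two TDNN-like models.
# Checkpoints are stand-in dicts of numpy arrays; names/shapes are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

def init_model(num_hidden=4, dim=64, out_dim=100):
    """Randomly initialise a stack of hidden layers plus an output layer."""
    model = {f"tdnn{i}.weight": rng.normal(size=(dim, dim)) for i in range(num_hidden)}
    model["output.weight"] = rng.normal(size=(out_dim, dim))
    return model

# "Out-of-domain" source model (e.g. trained on Chahar speech) and a fresh
# target model for the low-resource language (e.g. Khalkha).
source = init_model()
target = init_model()

def transfer(source, target, n_layers):
    """Copy the first n_layers hidden layers; leave the rest (incl. output) fresh."""
    for i in range(n_layers):
        key = f"tdnn{i}.weight"
        target[key] = source[key].copy()
    return target

# Transferring parts of the model (here the bottom 3 of 4 hidden layers),
# as opposed to copying the whole network:
target = transfer(source, target, n_layers=3)
```

Only the copied layers match the source; the top hidden layer and the output layer keep their fresh initialisation, which is then trained on the in-domain data.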
Year
2018
DOI
10.1109/IALP.2018.8629237
Venue
2018 International Conference on Asian Language Processing (IALP)
Keywords
Mongolian, speech recognition, weight transfer
Field
Data modeling, Weight transfer, Computer science, Transfer of learning, Speech recognition, Time delay neural network, Mutual information, Hidden Markov model, Artificial neural network, Acoustic model
DocType
Conference
ISSN
2159-1962
ISBN
978-1-5386-8298-2
Citations
0
PageRank
0.34
References
0
Authors
4
Name | Order | Citations | PageRank
Linyan Shi | 1 | 0 | 0.34
Fei Long | 2 | 16 | 13.09
Yonghe Wang | 3 | 0 | 2.37
Guanglai Gao | 4 | 78 | 24.57