Research on Mongolian Speech Recognition Based on FSMN. - Citegraph

Paper Info

Title
Research on Mongolian Speech Recognition Based on FSMN.

Abstract
Deep neural network (DNN) models have achieved significant results on the Mongolian speech recognition task; however, compared with Chinese, English, and other languages, there is still room for improvement. This paper presents the first application of the Feed-forward Sequential Memory Network (FSMN) to Mongolian speech recognition, modeling long-term dependencies in time series without recurrent feedback. Furthermore, by modeling the speaker in the feature space, we extract i-vector features and combine them with Fbank features as the input to validate their effectiveness in Mongolian ASR tasks. Finally, discriminative training is applied to the FSMN for the first time, using maximum mutual information (MMI) and state-level minimum Bayes risk (sMBR), respectively. The experimental results show that the FSMN outperforms the DNN on Mongolian ASR, and that using i-vector features combined with Fbank features as the FSMN input, together with discriminative training, yields a 17.9% relative reduction in word error rate (WER) compared with the DNN baseline.
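
The three ideas named in the abstract (appending an i-vector to each Fbank frame, a feed-forward memory block that looks back over past frames without recurrence, and a relative WER reduction) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the unidirectional tap layout, and all dimensions are assumptions made for illustration, and the record does not state the FSMN variant, memory order, or feature sizes used in the paper.

```python
from typing import List

Vector = List[float]


def append_ivector(fbank_frames: List[Vector], ivector: Vector) -> List[Vector]:
    """Append the same speaker-level i-vector to every Fbank frame.

    Assumption: one i-vector per utterance, concatenated frame by frame.
    """
    return [frame + ivector for frame in fbank_frames]


def fsmn_memory(hidden: List[Vector], taps: List[Vector]) -> List[Vector]:
    """Unidirectional, element-wise (vectorized) FSMN-style memory block.

    hidden: T frames of D-dimensional activations.
    taps:   N+1 coefficient vectors of dimension D; taps[i] weights the
            frame i steps in the past (taps[0] weights the current frame).
    Output frame t is sum_i taps[i] * hidden[t - i], skipping t - i < 0,
    so long-span context is captured without recurrent feedback.
    """
    dim = len(hidden[0])
    out: List[Vector] = []
    for t in range(len(hidden)):
        acc = [0.0] * dim
        for i, tap in enumerate(taps):
            if t - i < 0:
                break  # no frames before the start of the utterance
            past = hidden[t - i]
            for d in range(dim):
                acc[d] += tap[d] * past[d]
        out.append(acc)
    return out


def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Relative WER reduction in percent, measured against the baseline."""
    return 100.0 * (baseline_wer - new_wer) / baseline_wer
```

As a hypothetical example of the last function, a system that lowers WER from 20.0% to 15.0% achieves a 25% relative reduction; the 17.9% figure in the abstract is a relative reduction of this kind, not an absolute difference in WER.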

Year	DOI	Venue
2017	10.1007/978-3-319-73618-1_21	Lecture Notes in Artificial Intelligence
Keywords	DocType	Volume
Mongolian,Speech recognition,DNN,FSMN,i-vector,Sequence-criterion training	Conference	10619
ISSN	Citations	PageRank
0302-9743	0	0.34
References	Authors
0	4

Authors (4 rows)

Name	Order	Citations	PageRank
Yonghe Wang	1	0	2.37
Fei Long	2	16	13.09
Hongwei Zhang	3	3	3.54
Guanglai Gao	4	78	24.57