Title
Segmentation-Based Mongolian LVCSR Approach
Abstract
Mongolian is an agglutinative language: each root can be followed by several suffixes to form new words. This word formation process yields what are probably millions of Mongolian word forms, far beyond the coverage of the pronunciation dictionary of any current Mongolian speech recognition system. Moreover, even if the pronunciation dictionary were large enough to cover all Mongolian words, the recognition system still could not perform well because of the sample sparseness problem. In this paper, we propose a segmentation-based Mongolian Large Vocabulary Continuous Speech Recognition (LVCSR) approach and rebuild the corresponding acoustic model and language model. Experimental results show that, by converting most out-of-vocabulary words into their corresponding in-vocabulary forms, the proposed approach effectively recognizes most Mongolian words and greatly alleviates the sample sparseness problem in the language model.
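The idea of trading a word-level dictionary for a stem-and-suffix dictionary can be illustrated with a small sketch. This is not the paper's segmentation method: the stems, suffixes, and words below are made-up strings, and the function names (`segment`, `split_suffixes`, `oov_rate`) are hypothetical helpers chosen for this example. It only shows how splitting words into known units can lower the out-of-vocabulary rate.

```python
# Toy illustration only. STEMS, SUFFIXES, and the word list are made-up
# strings (assumptions for this sketch), not real Mongolian data and not
# the segmentation rules used in the paper.
STEMS = {"kat", "mol"}
SUFFIXES = {"ta", "rin", "d"}


def split_suffixes(rest, suffixes):
    """Decompose `rest` into zero or more known suffixes, or return None."""
    if not rest:
        return []
    # Try longer suffixes first, backtracking if the remainder cannot be split.
    for cut in range(len(rest), 0, -1):
        head = rest[:cut]
        if head in suffixes:
            tail = split_suffixes(rest[cut:], suffixes)
            if tail is not None:
                return [head] + tail
    return None


def segment(word, stems=STEMS, suffixes=SUFFIXES):
    """Split `word` into [stem, suffix, ...], or return None if impossible."""
    # Try longer stems first so the longest matching stem wins.
    for cut in range(len(word), 0, -1):
        stem, rest = word[:cut], word[cut:]
        if stem in stems:
            tail = split_suffixes(rest, suffixes)
            if tail is not None:
                return [stem] + tail
    return None


def oov_rate(tokens, vocabulary):
    """Fraction of tokens that are not in `vocabulary`."""
    return sum(t not in vocabulary for t in tokens) / len(tokens)


words = ["kat", "katta", "katrin", "mold", "molrind"]
word_vocab = {"kat", "mold"}       # small word-level dictionary
unit_vocab = STEMS | SUFFIXES      # stem-and-suffix dictionary

# Replace each word by its units; unsegmentable words are kept whole.
units = [u for w in words for u in (segment(w) or [w])]

word_level_oov = oov_rate(words, word_vocab)
unit_level_oov = oov_rate(units, unit_vocab)
```

In this toy setting, three of the five words are missing from the word-level dictionary, while every segmented unit is covered by the much smaller unit dictionary. A real system would also need pronunciations for each unit and a way to rejoin units into words after decoding, which this sketch leaves out.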
Year: 2013
DOI: 10.1109/ICASSP.2013.6639250
Venue: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Keywords: Mongolian, segmentation, stem, ending suffix, LVCSR
Field: Pronunciation, Word formation, Recognition system, Computer science, Segmentation, Agglutinative language, Speech recognition, Natural language processing, Artificial intelligence, Vocabulary, Language model, Acoustic model
DocType: Conference
Volume: null
Issue: null
ISSN: 1520-6149
Citations: 4
PageRank: 0.49
References: 3
Authors
4
Name
Order
Citations
PageRank
Fei Long11613.09
Guanglai Gao27824.57
Xueliang Yan372.38
Wei-Hua Wang4428.06