Title
Integrating multi-level linguistic knowledge with a unified framework for Mandarin speech recognition
Abstract
To improve Mandarin large vocabulary continuous speech recognition (LVCSR), a unified framework is introduced to exploit multi-level linguistic knowledge. In this framework, each knowledge source is represented by a weighted finite-state transducer (WFST), and the transducers are then combined into a so-called analyzer that integrates the multi-level knowledge sources. Because of the uniform transducer representation, any knowledge source can be integrated into the analyzer as long as it can be encoded as a WFST. Moreover, because the knowledge at each level is modeled independently and the combination is performed at the model level, the information inherent in each knowledge source can be exploited thoroughly. The effectiveness of the analyzer is first investigated through simulations, and an LVCSR system embedding the analyzer is then evaluated. Experimental results show that the unified framework significantly improves recognition performance, with a 9.9% relative reduction in character error rate on the HUB-4 test set, a widely used Mandarin speech recognition task.
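The idea of encoding each knowledge source as a WFST and combining them at the model level can be illustrated with a minimal sketch of epsilon-free transducer composition in the tropical semiring (weights add along a path, and the cheapest path wins). Everything below is an assumption made for illustration: the `Wfst` class, the `compose` and `best_output` helpers, the two toy models and their weights are hypothetical and are not the paper's analyzer, models, or toolkit.

```python
from collections import defaultdict, deque


class Wfst:
    """Toy epsilon-free weighted transducer (hypothetical helper, tropical weights)."""

    def __init__(self, start, finals, arcs):
        self.start = start
        self.finals = dict(finals)          # state -> final weight
        self.arcs = defaultdict(list)       # state -> [(ilabel, olabel, weight, next)]
        for src, ilab, olab, w, dst in arcs:
            self.arcs[src].append((ilab, olab, w, dst))


def compose(a, b):
    """Compose a and b by matching a's output labels with b's input labels."""
    start = (a.start, b.start)
    arcs, finals = [], {}
    queue, seen = deque([start]), {start}
    while queue:
        qa, qb = queue.popleft()
        if qa in a.finals and qb in b.finals:
            finals[(qa, qb)] = a.finals[qa] + b.finals[qb]
        for ilab, mid, w1, na in a.arcs.get(qa, []):
            for mid2, olab, w2, nb in b.arcs.get(qb, []):
                if mid == mid2:
                    nxt = (na, nb)
                    arcs.append(((qa, qb), ilab, olab, w1 + w2, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append(nxt)
    return Wfst(start, finals, arcs)


def best_output(t, inputs):
    """Return (cost, outputs) for the cheapest accepting path, or None."""
    frontier = {t.start: (0.0, [])}         # state -> best (cost, outputs) so far
    for sym in inputs:
        nxt = {}
        for state, (cost, outs) in frontier.items():
            for ilab, olab, w, dst in t.arcs.get(state, []):
                if ilab == sym:
                    cand = (cost + w, outs + [olab])
                    if dst not in nxt or cand[0] < nxt[dst][0]:
                        nxt[dst] = cand
        frontier = nxt
    best = None
    for state, (cost, outs) in frontier.items():
        if state in t.finals:
            total = cost + t.finals[state]
            if best is None or total < best[0]:
                best = (total, outs)
    return best


# Knowledge source 1 (toy): syllable -> character lexicon, made-up weights.
lexicon = Wfst(0, {0: 0.0}, [
    (0, "shi", "是", 0.5, 0), (0, "shi", "事", 1.0, 0),
    (0, "qing", "情", 0.4, 0), (0, "qing", "请", 0.6, 0),
])

# Knowledge source 2 (toy): character-sequence model, made-up weights.
char_model = Wfst(0, {3: 0.0}, [
    (0, "事", "事", 0.2, 1), (0, "是", "是", 0.1, 2),
    (1, "情", "情", 0.1, 3), (1, "请", "请", 2.0, 3),
    (2, "情", "情", 2.0, 3), (2, "请", "请", 1.5, 3),
])

# Model-level combination: one transducer that applies both sources at once.
analyzer = compose(lexicon, char_model)

if __name__ == "__main__":
    print(best_output(lexicon, ["shi", "qing"]))
    print(best_output(analyzer, ["shi", "qing"]))
```

In this toy setup the lexicon alone prefers 是情 for the input "shi qing", while the composed transducer prefers 事情, because the second knowledge source penalizes the unlikely character sequence. Adding a further knowledge source would, under the same assumptions, amount to one more `compose` call, which is the practical appeal of a uniform transducer representation.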
Year
2008
Venue
EMNLP
Keywords
speech recognition, multi-level linguistic knowledge, effective approach, unified framework, so-called analyzer, mandarin speech recognition task, multi-level knowledge source, lvcsr system, knowledge source, large vocabulary continuous speech
Field
Transducer, Computer science, Artificial intelligence, Natural language processing, Embedding, Word error rate, Exploit, Speech recognition, Linguistics, Spectrum analyzer, Vocabulary, Mandarin Chinese, Test set
DocType
Conference
Citations
0
PageRank
0.34
References
29
Authors
4
Name             Order  Citations  PageRank
Xinhao Wang      1      57         15.23
Jiazhong Nie     2      45         4.72
Dingsheng Luo    3      46         11.61
Xihong Wu        4      279        53.02