Title
Lightly supervised acoustic model training for Mandarin continuous speech recognition
Abstract
This paper investigates a lightly supervised acoustic model training method for Mandarin continuous speech recognition systems. Speech materials with rough transcriptions, which provide light supervision for acoustic model training, are now available in various forms. In this work, the quality problems of such data are classified into two types: the first is non-speech and low-quality speech in the corpora, and the second is transcription errors. A framework is proposed to tackle the two types separately: speech recognition with a transcription-relevant language model is used to remove the first type, while speech recognition with a general language model provides candidate transcription errors, which are then checked by a final automatic verification process. The proposed framework was evaluated from two aspects: data quality improved significantly, and speech recognition results show a 21.88% relative reduction in character error rate (CER).
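The two-stage selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `drop_threshold` value, and the exact decision rules are assumptions made for illustration, and the two recognizer hypotheses are taken as plain strings produced elsewhere. The sketch also shows how a relative CER reduction is computed from a baseline and a new CER.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (r != h)))   # substitution or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate of hyp against a non-empty reference ref."""
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(baseline_cer, new_cer):
    """Relative CER reduction, as a fraction of the baseline CER."""
    return (baseline_cer - new_cer) / baseline_cer


def triage(transcript, biased_hyp, general_hyp, drop_threshold=0.5):
    """Classify one segment as 'drop', 'verify', or 'keep'.

    biased_hyp:  output of a recognizer using a transcription-relevant LM.
    general_hyp: output of a recognizer using a general LM.
    drop_threshold is a hypothetical value, not taken from the paper.
    """
    if cer(transcript, biased_hyp) > drop_threshold:
        return "drop"      # likely non-speech or low-quality speech
    if general_hyp != transcript:
        return "verify"    # candidate transcription error, to be checked
    return "keep"
```

Segments marked `verify` would go to the automatic verification step mentioned in the abstract, which is not modelled here.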
Year: 2012
DOI: 10.1007/978-3-642-36669-7_88
Venue: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Keywords: speech recognition, candidate transcription error, speech recognition result, speech recognition system, speech material, mandarin continuous speech recognition, supervised acoustic model training, general language model, low-quality speech, transcription-relevant language model, acoustic model training
Field: Speech corpus, Data quality, Computer science, Speech recognition, Artificial intelligence, Natural language processing, Mandarin Chinese, Language model, Acoustic model
DocType: Conference
Volume: 7751 LNCS
Issue: null
ISSN: 1611-3349
Citations: 1
PageRank: 0.38
References: 7
Authors: 3
Name          Order  Citations  PageRank
Xiangang Li   1      58         12.99
Zaihu Pang    2      11         1.96
Xihong Wu     3      279        53.02