Title
Towards End-To-End Speech Recognition For Chinese Mandarin Using Long Short-Term Memory Recurrent Neural Networks
Abstract
End-to-end speech recognition systems have been successfully designed for English. Given the differences between Mandarin Chinese and English, additional work is needed to transfer these approaches to Chinese. In this paper, we attempt to build a Chinese speech recognition system using an end-to-end learning method. The system combines a deep Long Short-Term Memory Projected (LSTMP) network architecture with the Connectionist Temporal Classification (CTC) objective function. Chinese characters (about 6,000 of them) are used directly as the output labels. To integrate language model information during decoding, the CTC Beam Search method is adopted and optimized to make it more effective and more efficient. We present first-pass decoding results, obtained by decoding from scratch using the CTC-trained network and a language model. Although these results are not as good as those of a DNN-HMM hybrid system, they indicate that it is feasible to use Chinese characters as the output alphabet in an end-to-end speech recognition system.
Year
2015
Venue
16th Annual Conference of the International Speech Communication Association (INTERSPEECH 2015), Vols 1-5
Keywords
Long Short-Term Memory, End-to-end, Connectionist Temporal Classification, speech recognition
Field
Chinese characters, Computer science, Network architecture, Recurrent neural network, Beam search, Speech recognition, Decoding methods, Connectionism, Mandarin Chinese, Language model
DocType
Conference
Citations
5
PageRank
0.53
References
9
Authors
4
Name          Order   Citations   PageRank
J.X. Li       1       403113.63
Heng Zhang    2       5           1.20
Xinyuan Cai   3       5           0.86
Bo Xu         4       24136.59