Title
Towards end-to-end speech recognition with transfer learning
Abstract
This paper presents a transfer learning-based end-to-end speech recognition approach that operates at two levels. First, a feature extraction approach that combines multilingual deep neural network (DNN) training with a matrix factorization algorithm is introduced to extract high-level features. Second, the advantage of connectionist temporal classification (CTC) is transferred to the target attention-based model through a joint CTC-attention model composed of shallow recurrent neural networks (RNNs) built on top of the proposed features. Experimental results show that the proposed transfer learning approach achieves the best performance among all end-to-end methods and, when further jointly decoded with an RNN language model, is comparable to the state-of-the-art speech recognition system for TIMIT.
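The abstract does not say how the CTC and attention objectives are combined. A common formulation of joint CTC-attention training interpolates the two losses with a single weight, and joint decoding with a language model adds a weighted language-model log-probability to the hypothesis score. The sketch below illustrates that general idea only. The function names, `ctc_weight`, and `lm_weight` are hypothetical, and nothing here is confirmed as the authors' exact method or settings.

```python
def joint_ctc_attention_loss(ctc_loss: float, att_loss: float,
                             ctc_weight: float = 0.5) -> float:
    """Interpolate a CTC loss and an attention loss.

    ctc_weight is an assumed tuning knob in [0, 1]; the abstract does not
    state which value (if any) the authors used.
    """
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must lie in [0, 1]")
    return ctc_weight * ctc_loss + (1.0 - ctc_weight) * att_loss


def joint_decoding_score(log_p_ctc: float, log_p_att: float, log_p_lm: float,
                         ctc_weight: float = 0.5,
                         lm_weight: float = 0.0) -> float:
    """Score one hypothesis from CTC, attention, and language-model log-probs.

    lm_weight is likewise an assumption; setting it to 0 disables the
    language model and leaves only the CTC/attention interpolation.
    """
    acoustic = ctc_weight * log_p_ctc + (1.0 - ctc_weight) * log_p_att
    return acoustic + lm_weight * log_p_lm


if __name__ == "__main__":
    # Toy numbers, chosen only to exercise the functions.
    print(joint_ctc_attention_loss(2.0, 4.0, ctc_weight=0.5))
    print(joint_decoding_score(-1.0, -3.0, -2.0, ctc_weight=0.5, lm_weight=0.5))
```

With `ctc_weight` set to 0 the objective reduces to a pure attention loss, and with 1 it reduces to pure CTC, which makes the weight a convenient way to see how much each component contributes.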
Year
2018
DOI
10.1186/s13636-018-0141-9
Venue
EURASIP Journal on Audio, Speech, and Music Processing
Keywords
Speech recognition, End-to-end, Transfer learning
Field
TIMIT, Pattern recognition, Computer science, Matrix decomposition, Transfer of learning, Recurrent neural network, Feature extraction, Speech recognition, Artificial intelligence, Artificial neural network, Connectionism, Language model
DocType
Journal
Volume
2018
Issue
1
ISSN
1687-4722
Citations
3
PageRank
0.45
References
19
Authors
3
Name, Order, Citations, PageRank
Chu-Xiong Qin, 1, 3, 0.45
Dan Qu, 2, 17, 3.77
Lian-Hai Zhang, 3, 3, 0.45