Title
Gated convolutional networks based hybrid acoustic models for low resource speech recognition
Abstract
In acoustic modeling for large vocabulary speech recognition, recurrent neural networks (RNNs) have shown a strong ability to model temporal dependencies. However, RNNs do not perform well on resource-limited tasks and can even be worse than traditional feedforward neural networks (FNNs). Furthermore, RNNs take much longer to train than FNNs. In recent years, several novel models have been proposed that use non-recurrent architectures to model long-term dependencies. These architectures show that gating mechanisms are an effective way to construct acoustic models. On the other hand, convolution operations have proved effective for learning acoustic features. We aim to combine the advantages of these two methods. In this paper, we present a gated convolutional approach to low-resource speech recognition tasks. The gated convolutional networks use convolutional layers to learn input features and gates to control the flow of information. Experiments are conducted on OpenKWS, a series of low-resource keyword search evaluations. The results show that the gated convolutional networks reduce WER by about 6% relative over the baseline LSTM models, 5% over the DNN models, and 3% over the BLSTM models. In addition, the new models train more than 1.8 and 3.2 times faster than the baseline LSTM and BLSTM models, respectively.
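The abstract describes the core building block as a convolution that learns features plus a gate that controls information flow. Below is a minimal sketch of such a gated convolutional (GLU-style) layer, assuming PyTorch; the class name, layer sizes, and the choice of a 1-D convolution over the time axis are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class GatedConv1d(nn.Module):
    """Illustrative gated convolutional layer: conv features modulated by a sigmoid gate."""

    def __init__(self, in_channels: int, out_channels: int, kernel_size: int):
        super().__init__()
        padding = kernel_size // 2  # keep the time dimension unchanged
        # One convolution produces candidate features, the other produces the gate.
        self.feature_conv = nn.Conv1d(in_channels, out_channels, kernel_size, padding=padding)
        self.gate_conv = nn.Conv1d(in_channels, out_channels, kernel_size, padding=padding)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, feature_dim, time); the sigmoid gate decides how much of
        # each convolutional feature is passed on to the next layer.
        return self.feature_conv(x) * torch.sigmoid(self.gate_conv(x))

if __name__ == "__main__":
    # Hypothetical input: 4 utterances, 40-dim filterbank features, 100 frames.
    feats = torch.randn(4, 40, 100)
    layer = GatedConv1d(in_channels=40, out_channels=128, kernel_size=5)
    print(layer(feats).shape)  # torch.Size([4, 128, 100])
```

Because the layer contains no recurrence, every time step can be computed in parallel, which is consistent with the training speedups over LSTM/BLSTM reported in the abstract.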
Year
2017
DOI
10.1109/ASRU.2017.8268930
Venue
2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
Keywords
gated convolutional networks, nonrecurrent architectures, low resource speech recognition
DocType
Conference
ISBN
978-1-5090-4789-5
Citations
1
PageRank
0.37
References
0
Authors
3
Name             Order  Citations  PageRank
Jian Kang        1      15         2.66
Wei-Qiang Zhang  2      136        31.22
Jia Liu          3      277        50.34