Abstract | ||
---|---|---|
We present a novel deep Recurrent Neural Network (RNN) model for acoustic modelling in Automatic Speech Recognition (ASR). Our contribution, which we term the TC-DNN-BLSTM-DNN model, combines a Deep Neural Network (DNN) with Time Convolution (TC), followed by a Bidirectional Long Short-Term Memory (BLSTM) network and a final DNN. The first DNN acts as a feature processor for the model, the BLSTM then generates a context from the acoustic signal sequence, and the final DNN takes this context and models the posterior probabilities of the acoustic states. We achieve a 3.47% word error rate (WER) on the Wall Street Journal (WSJ) eval92 task, a relative improvement of more than 8% over the baseline DNN models. |
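
The data flow described in the abstract can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the layer sizes, the random weights, the omission of biases, the single layer per stage, and the clamped edge handling are all assumptions made for illustration. Time convolution is modelled here as one shared dense layer applied to a window of neighbouring frames, which is equivalent to a 1-D convolution over time. The names `splice`, `lstm_pass`, and `tc_dnn_blstm_dnn` are hypothetical.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    """Small random weights; a stand-in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def splice(frames, context):
    """Concatenate each frame with its +/- context neighbours (edges clamped)."""
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        window = []
        for k in range(-context, context + 1):
            window.extend(frames[min(max(t + k, 0), last)])
        out.append(window)
    return out


def dense_relu(W, x):
    return [max(0.0, z) for z in matvec(W, x)]


def lstm_pass(W, hidden, inputs):
    """Run one LSTM direction; W maps [x; h] to stacked gates (i, f, o, g)."""
    h = [0.0] * hidden
    c = [0.0] * hidden
    outputs = []
    for x in inputs:
        z = matvec(W, x + h)
        i = [sigmoid(v) for v in z[:hidden]]
        f = [sigmoid(v) for v in z[hidden:2 * hidden]]
        o = [sigmoid(v) for v in z[2 * hidden:3 * hidden]]
        g = [math.tanh(v) for v in z[3 * hidden:]]
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        outputs.append(h)
    return outputs


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def tc_dnn_blstm_dnn(frames, params, context, hidden):
    """TC-DNN front end -> BLSTM context -> DNN -> per-frame state posteriors."""
    # Shared dense layer over a time window = convolution across time.
    feats = [dense_relu(params["front"], x) for x in splice(frames, context)]
    # The backward direction reads the sequence reversed, then is re-aligned.
    fwd = lstm_pass(params["lstm_fwd"], hidden, feats)
    bwd = lstm_pass(params["lstm_bwd"], hidden, feats[::-1])[::-1]
    ctx = [f + b for f, b in zip(fwd, bwd)]
    return [softmax(matvec(params["out"], dense_relu(params["back"], c)))
            for c in ctx]


# Toy dimensions, chosen only to keep the example small (assumptions).
FEAT, CONTEXT, FRONT, HIDDEN, BACK, STATES = 4, 1, 6, 5, 6, 3
rng = random.Random(0)
params = {
    "front": rand_matrix(FRONT, FEAT * (2 * CONTEXT + 1), rng),
    "lstm_fwd": rand_matrix(4 * HIDDEN, FRONT + HIDDEN, rng),
    "lstm_bwd": rand_matrix(4 * HIDDEN, FRONT + HIDDEN, rng),
    "back": rand_matrix(BACK, 2 * HIDDEN, rng),
    "out": rand_matrix(STATES, BACK, rng),
}
frames = [[rng.uniform(-1, 1) for _ in range(FEAT)] for _ in range(7)]
posteriors = tc_dnn_blstm_dnn(frames, params, CONTEXT, HIDDEN)
```

With these toy settings, `posteriors` holds one distribution over the 3 hypothetical acoustic states for each of the 7 input frames. Because the BLSTM reads the sequence in both directions, every frame's posterior depends on the whole utterance rather than only on a fixed window.
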
Year | Venue | Field |
---|---|---|
2015 | CoRR | Convolution, Computer science, Recurrent neural network, Speech recognition, Posterior probability, Artificial intelligence, Artificial neural network, Machine learning |

DocType | Volume | Citations |
---|---|---|
Journal | abs/1504.01482 | 7 |

PageRank | References | Authors |
---|---|---|
0.60 | 4 | 2 |

Name | Order | Citations | PageRank |
---|---|---|---|
William Chan | 1 | 357 | 24.67 |
Ian R. Lane | 2 | 259 | 33.64 |