Abstract |
---|
Long short-term memory (LSTM) recurrent neural networks (RNNs) have recently shown significant performance improvements over deep feed-forward neural networks (DNNs). A key aspect of these models is the use of time recurrence, combined with a gating architecture that ameliorates the vanishing gradient problem. Inspired by human spectrogram reading, in this paper we propose an extension to LSTMs that performs the recurrence in frequency as well as in time. This model first scans the frequency bands to generate a summary of the spectral information, and then uses the output layer activations as the input to a traditional time LSTM (T-LSTM). Evaluated on a Microsoft short message dictation task, the proposed model obtained a 3.6% relative word error rate reduction over the T-LSTM. |
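The architecture the abstract describes — a frequency LSTM that scans each frame's spectrum to produce a summary, which then becomes the input to a time LSTM — can be sketched as follows. This is a minimal NumPy sketch, not the paper's implementation: the dimensions, the splitting into fixed-size frequency bands, and the use of the F-LSTM's final hidden state as the spectral summary are all assumptions for illustration.

```python
import numpy as np

def lstm_cell(x, h, c, W, U, b):
    """One LSTM step; W/U/b hold the stacked i, f, o, g parameters."""
    z = W @ x + U @ h + b
    n = h.size
    i = 1.0 / (1.0 + np.exp(-z[:n]))        # input gate
    f = 1.0 / (1.0 + np.exp(-z[n:2*n]))     # forget gate
    o = 1.0 / (1.0 + np.exp(-z[2*n:3*n]))   # output gate
    g = np.tanh(z[3*n:])                    # candidate cell update
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

def run_lstm(seq, W, U, b, n):
    """Run an LSTM over a sequence; return all hidden states and the last one."""
    h, c = np.zeros(n), np.zeros(n)
    hs = []
    for x in seq:
        h, c = lstm_cell(x, h, c, W, U, b)
        hs.append(h)
    return np.stack(hs), h

rng = np.random.default_rng(0)
T, F, B = 5, 40, 8      # frames, frequency bins, band size (hypothetical)
n_f, n_t = 16, 32       # F-LSTM and T-LSTM hidden sizes (hypothetical)
spec = rng.standard_normal((T, F))          # toy log-spectrogram

# Frequency-LSTM parameters: input is one band of B bins.
Wf = 0.1 * rng.standard_normal((4 * n_f, B))
Uf = 0.1 * rng.standard_normal((4 * n_f, n_f))
bf = np.zeros(4 * n_f)
# Time-LSTM parameters: input is the per-frame spectral summary.
Wt = 0.1 * rng.standard_normal((4 * n_t, n_f))
Ut = 0.1 * rng.standard_normal((4 * n_t, n_t))
bt = np.zeros(4 * n_t)

# Scan the frequency bands of each frame with the F-LSTM; keep its final
# hidden state as that frame's spectral summary (a stand-in for the
# "output layer activations" the abstract mentions).
summaries = []
for t in range(T):
    bands = spec[t].reshape(F // B, B)      # split the frame into F/B bands
    _, summary = run_lstm(bands, Wf, Uf, bf, n_f)
    summaries.append(summary)

# The sequence of summaries is the input to the traditional time LSTM.
hs, _ = run_lstm(summaries, Wt, Ut, bt, n_t)
print(hs.shape)  # one T-LSTM hidden state per frame
```

In the paper the F-LSTM and T-LSTM are trained jointly as part of the acoustic model; the sketch only shows the forward data flow of frequency recurrence followed by time recurrence.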
Year | DOI | Venue |
---|---|---
2015 | 10.1109/ASRU.2015.7404793 | 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) |
Keywords | Field | DocType
---|---|---
LSTM, RNN, time and frequency | Gating, Pattern recognition, Computer science, Spectrogram, Word error rate, Recurrent neural network, Speech recognition, Dictation, Artificial intelligence, Artificial neural network, Vanishing gradient problem | Conference
Citations | PageRank | References
---|---|---
10 | 0.52 | 14
Authors (4)

Name | Order | Citations | PageRank
---|---|---|---
Jinyu Li | 1 | 915 | 72.84 |
Abdel-rahman Mohamed | 2 | 3772 | 266.13 |
Geoffrey Zweig | 3 | 3406 | 320.25 |
Yifan Gong | 4 | 1332 | 135.58 |