Abstract
---
Long Short-Term Memory (Plain-LSTM) networks are effective for acoustic modeling in automatic speech recognition systems, but their training is obstructed by the vanishing and exploding gradient problems. To alleviate this, the paper introduces an improved space residual LSTM (S-RES-LSTM), which, unlike the previous RES-LSTM, uses the output before rather than after the LSTM projection layer as the spatial shortcut connection. Experiments on distant speech recognition with the AMI SDM corpus show that the 9-layer S-RES-LSTM achieves 5% absolute WER (over) and 5.9% absolute WER (non-over) reductions over the Plain-LSTM on the eval set, and a 0.6% absolute WER reduction over the 9-layer RES-LSTM. To further enhance the information flow of S-RES-LSTM, the space and time residual LSTM (ST-RES-LSTM) is proposed, which adds a novel residual connection in the temporal dimension. Compared with the Plain-LSTM and the RES-LSTM, the 9-layer ST-RES-LSTM achieves 5.5% and 1% absolute WER (over) reductions, respectively, on the eval set.
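As a rough illustration of the architectural change the abstract describes, the sketch below implements one plausible reading in PyTorch: each layer's spatial shortcut is the pre-projection LSTM output of the layer below, rather than the projected output used by RES-LSTM. This is a minimal sketch under stated assumptions, not the authors' implementation; the class name `SResLSTMLayer` and all layer sizes are hypothetical.

```python
# Hypothetical sketch of the S-RES-LSTM spatial shortcut (not the paper's code):
# each LSTMP layer adds the *pre-projection* output of the layer below as a
# residual, which is what the abstract says distinguishes S-RES-LSTM from RES-LSTM.
import torch
import torch.nn as nn


class SResLSTMLayer(nn.Module):
    """One assumed S-RES-LSTM layer: LSTM cell + projection, with the
    spatial shortcut taken before (not after) the projection."""

    def __init__(self, input_size: int, hidden_size: int, proj_size: int):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.proj = nn.Linear(hidden_size, proj_size)

    def forward(self, x, shortcut=None):
        out, _ = self.lstm(x)        # pre-projection output, shape (B, T, hidden_size)
        if shortcut is not None:
            out = out + shortcut     # spatial residual: add lower layer's pre-projection output
        # Projected output feeds the next layer; raw output is the next shortcut.
        return self.proj(out), out


# Stacking a 9-layer network (all sizes are illustrative, not from the paper):
feat_dim, hidden, proj, depth = 40, 512, 256, 9
layers = nn.ModuleList(
    [SResLSTMLayer(feat_dim if i == 0 else proj, hidden, proj) for i in range(depth)]
)
h, shortcut = torch.randn(8, 100, feat_dim), None  # (batch, frames, features)
for layer in layers:
    h, shortcut = layer(h, shortcut)
print(h.shape)  # torch.Size([8, 100, 256])
```

The ST-RES-LSTM additionally adds a residual connection along the temporal dimension; the abstract does not specify its exact form, so it is not sketched here.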
Year | DOI | Venue
---|---|---
2018 | 10.1109/ISCSLP.2018.8706565 | 2018 11th International Symposium on Chinese Spoken Language Processing (ISCSLP) |
Keywords | Field | DocType
---|---|---
Speech recognition, Training, Mathematical model, Hidden Markov models, Microphones, Acoustics, Neural networks | Space time, Residual, Pattern recognition, Computer science, Speech recognition, Artificial intelligence, Hidden Markov model, Artificial neural network | Conference
ISBN | Citations | PageRank
---|---|---
978-1-5386-5627-3 | 0 | 0.34
References | Authors
---|---
0 | 5
Name | Order | Citations | PageRank |
---|---|---|---|
Long Wu | 1 | 2 | 1.59 |
Li Wang | 2 | 250 | 56.88 |
Pengyuan Zhang | 3 | 50 | 19.46 |
Ta Li | 4 | 2 | 2.06 |
Yonghong Yan 0002 | 5 | 83 | 19.58 |