Title
Space-Time Residual LSTM Architecture for Distant Speech Recognition
Abstract
Long Short-Term Memory (Plain-LSTM) networks are effective for acoustic modeling in automatic speech recognition systems, but their training is hindered by the vanishing and exploding gradient problems. To alleviate this, the paper introduces an improved space residual LSTM (S-RES-LSTM), which, unlike the previous RES-LSTM, uses the output before rather than after the LSTM projection layer as the spatial shortcut connection. Experiments on distant speech recognition with the AMI single distant microphone (SDM) data show that a 9-layer S-RES-LSTM achieves 5% absolute WER (overlapped speech) and 5.9% absolute WER (non-overlapped speech) reductions over the Plain-LSTM on the eval set, as well as a 0.6% absolute WER reduction over the 9-layer RES-LSTM. To further enhance the information flow of the S-RES-LSTM, the space and time residual LSTM (ST-RES-LSTM) is proposed, which adds a novel residual connection in the temporal dimension. Experiments show that, with 9 layers on the eval set, the ST-RES-LSTM achieves 5.5% and 1% absolute WER (overlapped speech) reductions over the Plain-LSTM and the RES-LSTM, respectively.
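The abstract describes the shortcut placements only at a high level, so the following is a minimal PyTorch sketch of one possible reading, not the authors' published equations. Assumptions made here: standard LSTM gate equations via nn.LSTMCell, all layers sharing the same hidden size, the spatial shortcut adding the layer below's pre-projection output, the temporal shortcut adding the previous frame's pre-projection output, and (for brevity) recurrence over the unprojected hidden state rather than the projected one as in a true LSTMP layer. The class name STResLSTMLayer is hypothetical.

```python
# Sketch only: illustrates residual connections placed BEFORE the LSTM
# projection (S-RES-LSTM) plus an extra temporal residual (ST-RES-LSTM).
# Not the paper's exact formulation; see the stated assumptions above.
import torch
import torch.nn as nn


class STResLSTMLayer(nn.Module):
    """One LSTM layer with a projection and optional space/time residuals."""

    def __init__(self, input_size, hidden_size, proj_size,
                 space_residual=True, time_residual=True):
        super().__init__()
        self.cell = nn.LSTMCell(input_size, hidden_size)
        self.proj = nn.Linear(hidden_size, proj_size, bias=False)
        self.space_residual = space_residual
        self.time_residual = time_residual

    def forward(self, x, skip=None):
        # x:    (T, B, input_size)   projected output of the layer below
        # skip: (T, B, hidden_size)  pre-projection output of the layer below
        #                            (the assumed spatial shortcut), or None
        T, B, _ = x.shape
        h = x.new_zeros(B, self.cell.hidden_size)
        c = x.new_zeros(B, self.cell.hidden_size)
        prev_m = x.new_zeros(B, self.cell.hidden_size)
        pre_proj, outputs = [], []
        for t in range(T):
            h, c = self.cell(x[t], (h, c))
            m = h                              # output BEFORE the projection
            if self.space_residual and skip is not None:
                m = m + skip[t]                # spatial shortcut (S-RES-LSTM)
            if self.time_residual:
                m = m + prev_m                 # temporal shortcut (ST-RES-LSTM)
            prev_m = m
            pre_proj.append(m)                 # exposed as the next layer's skip
            outputs.append(self.proj(m))       # projected output fed upward
        return torch.stack(outputs), torch.stack(pre_proj)
```

Under these assumptions, stacking several such layers and passing each layer's pre_proj tensor as the next layer's skip realizes the spatial shortcut taken before the projection, keeping the residual path at the full hidden dimensionality; setting time_residual=False would correspond to the S-RES-LSTM variant.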
Year
2018
DOI
10.1109/ISCSLP.2018.8706565
Venue
2018 11th International Symposium on Chinese Spoken Language Processing (ISCSLP)
Keywords
Speech recognition, Training, Mathematical model, Hidden Markov models, Microphones, Acoustics, Neural networks
Field
Space time, Residual, Pattern recognition, Computer science, Speech recognition, Artificial intelligence, Hidden Markov model, Artificial neural network
DocType
Conference
ISBN
978-1-5386-5627-3
Citations
0
PageRank
0.34
References
0
Authors
5
Name                Order  Citations  PageRank
Long Wu             1      2          1.59
Li Wang             2      250        56.88
Pengyuan Zhang      3      50         19.46
Ta Li               4      2          2.06
Yonghong Yan 0002   5      83         19.58