Endpoint Detection Using Grid Long Short-Term Memory Networks For Streaming Speech Recognition - Citegraph

Paper Info

Title
Endpoint Detection Using Grid Long Short-Term Memory Networks For Streaming Speech Recognition

Abstract
The task of endpointing is to determine when the user has finished speaking. This is important for interactive speech applications such as voice search and Google Home. In this paper, we propose a GLDNN-based (grid long short-term memory deep neural network) endpointer model and show that it provides significant improvements over a state-of-the-art CLDNN (convolutional, long short-term memory, deep neural network) model. Specifically, we replace the convolution layer in the CLDNN with a grid LSTM layer that models both spectral and temporal variations through recurrent connections. Results show that the GLDNN achieves 32% relative improvement in false alarm rate at a fixed false reject rate of 2%, and reduces median latency by 11%. We also include detailed experiments investigating why grid LSTMs offer better performance than convolution layers. Analysis reveals that the recurrent connection along the frequency axis is an important factor that greatly contributes to the performance of grid LSTMs, especially in the presence of background noise. Finally, we also show that multichannel input further increases robustness to background speech. Overall, we achieve 16% (100 ms) endpointer latency improvement relative to our previous best model on a Voice Search Task.
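
The abstract reports its gains as relative improvements on lower-is-better metrics. The short Python sketch below shows how such a figure is usually computed and what the "16% (100 ms)" pairing would imply. It assumes that both numbers describe the same latency reduction; the abstract does not state the baseline latency, so the derived values are an inference, not reported results. The function names are hypothetical.

```python
def relative_improvement(baseline: float, new: float) -> float:
    """Fractional reduction of a lower-is-better metric (latency, false alarm rate)."""
    return (baseline - new) / baseline


def implied_baseline(absolute_gain: float, relative_gain: float) -> float:
    """Baseline value implied by an absolute gain and the matching relative gain."""
    return absolute_gain / relative_gain


# Assumption: "16% (100 ms)" means a 100 ms reduction equals 16% of the baseline.
baseline_ms = implied_baseline(100.0, 0.16)   # implied latency of the previous best model
improved_ms = baseline_ms - 100.0             # implied latency after the improvement

print(f"implied baseline latency: {baseline_ms:.0f} ms")
print(f"implied improved latency: {improved_ms:.0f} ms")
print(f"check: {relative_improvement(baseline_ms, improved_ms):.2%}")
```

Under that assumption the previous best model would have had an endpointer latency of roughly 625 ms, falling to roughly 525 ms.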

Year	DOI	Venue
2017	10.21437/Interspeech.2017-284	18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION
Field	DocType	ISSN
Pattern recognition,Voice activity detection,Computer science,Long short term memory,Speech recognition,Real-time computing,Artificial intelligence,Grid	Conference	2308-457X
Citations	PageRank	References
0	0.34	6
Authors
5

Authors (5 rows)

Name	Order	Citations	PageRank
Shuo-Yiin Chang	1	27	4.71
Bo Li	2	206	42.46
Tara N. Sainath	3	3497	232.43
Gabor Simko	4	42	7.06
Carolina Parada	5	242	13.11
