Title
Non-Uniform MCE Training of Deep Long Short-Term Memory Recurrent Neural Networks for Keyword Spotting
Abstract
It has been shown in [1, 2] that keyword spotting performance improves when the task is formulated as a non-uniform error automatic speech recognition problem. In this work, we discriminatively train a deep bidirectional long short-term memory (BLSTM)-hidden Markov model (HMM) acoustic model with a non-uniform boosted minimum classification error (BMCE) criterion, which imposes a higher error cost on keywords than on non-keywords. With the BLSTM, context information from both the past and the future is stored and updated to predict the desired output, so the long-term dependencies within the speech signal are well captured. With the non-uniform BMCE objective, the BLSTM is trained so that recognition errors related to the keywords are remarkably reduced. The BLSTM is optimized using backpropagation through time and stochastic gradient descent. The keyword spotting system is implemented within a weighted finite-state transducer framework. On the keyword spotting task on the Switchboard-1 Release 2 dataset, the proposed method achieves absolute figure-of-merit improvements of 5.49% and 7.37% over the BLSTM and the feedforward deep neural network baseline systems, respectively, both trained with the cross-entropy criterion.
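The idea of charging a higher error cost for keywords can be illustrated with a frame-level toy example. The sketch below is not the paper's objective: it uses the textbook MCE misclassification measure with a sigmoid smoothing, omits the boosting term and the sequence-level formulation, and all names and default values (`keyword_cost`, `gamma`, `eta`) are hypothetical choices made for illustration.

```python
import math


def misclassification_measure(scores, correct, eta=1.0):
    """Textbook MCE measure: d = -g_correct + (1/eta) * log(mean_j exp(eta * g_j)),
    where j ranges over the competing (incorrect) classes. d < 0 means the
    correct class outscores the soft maximum of its competitors."""
    competitors = [s for j, s in enumerate(scores) if j != correct]
    m = max(competitors)  # subtract the max for numerical stability
    mean_exp = sum(math.exp(eta * (s - m)) for s in competitors) / len(competitors)
    return -scores[correct] + m + math.log(mean_exp) / eta


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def non_uniform_mce_loss(frames, keyword_cost=2.0, gamma=1.0):
    """Average smoothed error over frames, weighted by a per-frame error cost.

    frames: list of (scores, correct_index, is_keyword) tuples.
    keyword_cost: assumed cost multiplier for keyword frames (non-keywords use 1.0).
    """
    total = 0.0
    for scores, correct, is_keyword in frames:
        d = misclassification_measure(scores, correct)
        cost = keyword_cost if is_keyword else 1.0
        total += cost * sigmoid(gamma * d)
    return total / len(frames)


if __name__ == "__main__":
    # Same misclassified frame, once labeled as keyword and once as non-keyword.
    scores = [2.0, 0.5, 0.1]
    print(non_uniform_mce_loss([(scores, 1, True)], keyword_cost=3.0))
    print(non_uniform_mce_loss([(scores, 1, False)], keyword_cost=3.0))
```

With identical scores, the keyword frame contributes `keyword_cost` times the loss of the non-keyword frame, so gradient-based training is pushed harder to correct errors on keywords.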
Year
2017
DOI
10.21437/Interspeech.2017-583
Venue
18th Annual Conference of the International Speech Communication Association (INTERSPEECH 2017), Vols 1-6: Situated Interaction
Keywords
automatic speech recognition, keyword spotting, long short-term memory, recurrent neural networks, acoustic modeling, discriminative training
Field
Pattern recognition, Computer science, Recurrent neural network, Long short term memory, Speech recognition, Keyword spotting, Natural language processing, Artificial intelligence
DocType
Conference
ISSN
2308-457X
Citations
1
PageRank
0.36
References
15
Authors
2
Name
Order
Citations
PageRank
Zhong Meng13314.95
Biing-Hwang Juang23388699.72