Title
Using neural network front-ends on far field multiple microphones based speech recognition
Abstract
This paper investigates far-field speech recognition using beamforming and channel concatenation in the context of Deep Neural Network (DNN) based feature extraction. While speech enhancement with beamforming is attractive, the algorithms are typically signal-based and use no information about the special properties of speech. A simple alternative to beamforming is to concatenate the features from multiple channels. Results presented in this paper indicate that channel concatenation gives results similar to or better than beamforming. On average, the DNN front-end yields a 25% relative reduction in Word Error Rate (WER). Further experiments aim at including relevant information in training adapted DNN features. Augmenting the standard DNN input with the bottleneck feature from a Speaker Aware Deep Neural Network (SADNN) shows a general advantage over the standard DNN based recognition system and yields additional improvements for far-field speech recognition.
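The two quantities the abstract relies on, channel concatenation and relative WER reduction, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy feature values, and the 40% and 30% WER figures are assumptions made up for illustration, and the sketch assumes the per-channel features are already time-aligned.

```python
from typing import List

Frame = List[float]      # one feature vector for one frame (hypothetical layout)
Channel = List[Frame]    # all frames from one microphone channel


def concatenate_channels(channels: List[Channel]) -> List[Frame]:
    """Stack time-aligned per-channel frames into one longer frame per time step."""
    if not channels:
        raise ValueError("need at least one channel")
    n_frames = len(channels[0])
    if any(len(ch) != n_frames for ch in channels):
        raise ValueError("channels must have the same number of frames")
    # For each time step, append the features of every channel in order.
    return [
        [value for ch in channels for value in ch[t]]
        for t in range(n_frames)
    ]


def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Relative WER reduction in percent: 100 * (baseline - new) / baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Toy data (made up): 2 microphones, 3 frames, 2 features per frame.
mic_a = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
mic_b = [[1.1, 1.2], [1.3, 1.4], [1.5, 1.6]]
stacked = concatenate_channels([mic_a, mic_b])

# Hypothetical WERs: a drop from 40% to 30% is a 25% relative reduction.
example_reduction = relative_wer_reduction(40.0, 30.0)
```

With these toy inputs, each stacked frame holds four values (two per microphone), which is the kind of widened input a DNN front-end would receive in place of a single beamformed channel.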
Year
2014
DOI
10.1109/ICASSP.2014.6854663
Venue
ICASSP
Keywords
dnn based recognition system, deep neural networks, sadnn, speech recognition, word error rate, multiple channel feature concatenation, microphones, beamforming, array signal processing, feature extraction, speaker aware deep neural network, neural network front-ends, speech enhancement, multiple microphone, far field multiple microphones, neural nets, wer, speech processing, hidden markov models, speech
DocType
Conference
ISSN
1520-6149
Citations
28
PageRank
0.90
References
18
Authors
3
Name             Order  Citations  PageRank
Yulan Liu        1      48         4.19
Pengyuan Zhang   2      50         19.46
Thomas Hain      3      105        14.91