Title
Neural network based spectral mask estimation for acoustic beamforming.
Abstract
We present a neural network based approach to acoustic beamforming. The network estimates spectral masks, from which the cross-power spectral density matrices of speech and noise are estimated; these matrices are in turn used to compute the beamformer coefficients. Network training is independent of the number and geometric configuration of the microphones. We further show that it is possible to train the network on clean speech only, avoiding the need for stereo data with separated speech and noise. Two types of networks are evaluated: a small feed-forward network with only one hidden layer, and a more elaborate bi-directional Long Short-Term Memory network. We compare our system with different parametric approaches to mask estimation, using different beamforming algorithms. We show that our system yields superior results, both in terms of perceptual speech quality and with respect to speech recognition error rate. The results for the simple feed-forward network are especially encouraging considering its low computational requirements.
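The pipeline described in the abstract (masks, then cross-power spectral density matrices, then beamformer coefficients) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name a specific beamformer, so an MVDR-style solution with a reference microphone is assumed here; the masks are random placeholders standing in for network outputs; and the function names `masked_psd`, `mvdr_weights`, and `apply_beamformer` are hypothetical.

```python
import numpy as np

def masked_psd(Y, mask, eps=1e-10):
    """Mask-weighted cross-power spectral density matrix per frequency bin.

    Y:    complex STFT, shape (F, T, D) = (frequency bins, frames, microphones)
    mask: real-valued mask in [0, 1], shape (F, T)
    Returns an array of shape (F, D, D).
    """
    weighted = mask[..., None] * Y
    # Sum over frames of mask * y * y^H for every frequency bin.
    psd = np.einsum('ftd,fte->fde', weighted, Y.conj())
    return psd / (mask.sum(axis=1)[:, None, None] + eps)

def mvdr_weights(psd_speech, psd_noise, ref_mic=0, eps=1e-10):
    """MVDR-style weights (assumed beamformer choice), shape (F, D)."""
    D = psd_noise.shape[-1]
    # Small diagonal loading keeps the noise matrix invertible.
    psd_noise = psd_noise + eps * np.eye(D)
    numerator = np.linalg.solve(psd_noise, psd_speech)   # Phi_nn^-1 Phi_xx
    trace = np.trace(numerator, axis1=1, axis2=2)
    return numerator[:, :, ref_mic] / (trace[:, None] + eps)

def apply_beamformer(weights, Y):
    """Compute w^H y for every time-frequency bin; returns shape (F, T)."""
    return np.einsum('fd,ftd->ft', weights.conj(), Y)

# Toy usage with random data in place of real recordings and network masks.
rng = np.random.default_rng(0)
F, T, D = 5, 40, 4
Y = rng.standard_normal((F, T, D)) + 1j * rng.standard_normal((F, T, D))
speech_mask = rng.uniform(size=(F, T))   # placeholder for a network output
noise_mask = 1.0 - speech_mask           # placeholder for a network output

psd_speech = masked_psd(Y, speech_mask)
psd_noise = masked_psd(Y, noise_mask)
weights = mvdr_weights(psd_speech, psd_noise)
enhanced = apply_beamformer(weights, Y)
```

Because the masks are applied per time-frequency bin and the matrices are accumulated over whatever channels are present, nothing in this sketch depends on the number or arrangement of microphones, which mirrors the independence property stated in the abstract.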
Year
2016
DOI
10.1109/ICASSP.2016.7471664
Venue
ICASSP
Keywords
Acoustic Beam-forming, Deep Neural Network, Feature Enhancement, Robust Speech Recognition
Field
Speech processing, Beamforming, Pattern recognition, Computer science, Word error rate, Spectral mask, Time delay neural network, Parametric statistics, Artificial intelligence, Artificial neural network, Linear predictive coding
DocType
Conference
ISSN
1520-6149
Citations
39
PageRank
1.95
References
16
Authors
3
Name                  Order  Citations  PageRank
Jahn Heymann          1      102        10.29
Lukas Drude           2      95         11.10
Reinhold Haeb-Umbach  3      1487       211.71