Title
Smoothing along Frequency in Online Neural Network Supported Acoustic Beamforming
Abstract
We present a block-online multi-channel front end for automatic speech recognition in noisy and reverberant environments. It is an online version of our previously proposed neural network supported acoustic beamformer, whose coefficients are calculated from noise and speech spatial covariance matrices that are estimated using a neural mask estimator. However, the sparsity of speech in the STFT domain causes problems for the initial estimation of the beamformer coefficients in some frequency bins, due to a lack of speech observations. We propose two methods to mitigate this issue. The first is to lower the frequency resolution of the STFT, which has the additional advantage of a shorter time window, thus lowering the latency introduced by block processing. The second is to smooth the beamforming coefficients along the frequency axis, thus exploiting their high inter-frequency correlation. With both approaches, the gap between offline and block-online beamformer performance, as measured by the word error rate achieved by a downstream speech recognizer, is significantly reduced. Experiments are carried out on two corpora, representing noisy (CHiME-4) and noisy reverberant (voiceHome) environments.
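The second approach, smoothing along the frequency axis, can be illustrated with a minimal sketch. The abstract does not state which smoothing operation is used, so the moving average below is an assumption, and the function name `smooth_along_frequency` and the parameter `half_width` are hypothetical. The sketch also assumes that the coefficient vectors of neighbouring bins are already comparable in phase and scale (for example after normalization to a reference channel), which the abstract does not discuss.

```python
def smooth_along_frequency(weights, half_width=1):
    """Moving-average smoothing of beamformer weights across frequency bins.

    weights:    list of F lists, each holding D complex coefficients
                (one coefficient vector per frequency bin).
    half_width: hypothetical parameter; number of neighbouring bins on
                each side that enter the average (0 disables smoothing).
    """
    num_bins = len(weights)
    smoothed = []
    for f in range(num_bins):
        # Clip the averaging window at the lowest and highest bin.
        lo = max(0, f - half_width)
        hi = min(num_bins, f + half_width + 1)
        window = weights[lo:hi]
        num_channels = len(weights[f])
        # Average each channel's coefficient over the bins in the window.
        smoothed.append([
            sum(w[d] for w in window) / len(window)
            for d in range(num_channels)
        ])
    return smoothed


# Toy example: 3 frequency bins, 2 channels. The middle bin has no
# usable estimate (all zeros), as can happen when no speech was observed.
weights = [[1 + 0j, 0j], [0j, 0j], [1 + 0j, 2 + 0j]]
smoothed = smooth_along_frequency(weights, half_width=1)
```

In this toy case the middle bin, which had no estimate of its own, receives a nonzero coefficient vector from its neighbours. That is the intuition behind exploiting inter-frequency correlation; the kernel shape and width in a real system would have to be chosen experimentally.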
Year: 2018
Venue: Speech Communication; 13th ITG-Symposium
DocType: Conference
ISBN: 978-3-8007-4767-2
Citations: 0
PageRank: 0.34
References: 0
Authors: 3
Name                    Order  Citations  PageRank
Jens Heitkaemper        1      4          2.50
Jahn Heymann            2      102        10.29
Reinhold Haeb-Umbach    3      1487       211.71