Smoothing along Frequency in Online Neural Network Supported Acoustic Beamforming - Citegraph

Paper Info

Title
Smoothing along Frequency in Online Neural Network Supported Acoustic Beamforming

Abstract
We present a block-online multi-channel front end for automatic speech recognition in noisy and reverberated environments. It is an online version of our previously proposed neural network supported acoustic beamformer, whose coefficients are calculated from noise and speech spatial covariance matrices that are estimated using a neural mask estimator. However, the sparsity of speech in the STFT domain causes problems for the estimation of the initial beamformer coefficients in some frequency bins, due to a lack of speech observations. We propose two methods to mitigate this issue. The first is to lower the frequency resolution of the STFT, which comes with the additional advantage of a reduced time window, thus lowering the latency introduced by block processing. The second approach is to smooth the beamforming coefficients along the frequency axis, thus exploiting their high inter-frequency correlation. With both approaches, the gap between offline and block-online beamformer performance, as measured by the word error rate achieved by a downstream speech recognizer, is significantly reduced. Experiments are carried out on two corpora, representing noisy (CHiME-4) and noisy reverberant (voiceHome) environments.
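
The abstract does not say how the smoothing along frequency is carried out. As a rough illustration of the idea only, the sketch below assumes a plain moving average over neighbouring frequency bins, applied to each microphone channel separately. The function name `smooth_along_frequency` and the `width` parameter are hypothetical, and the sketch also assumes that the coefficient phases are already consistent across bins, which not every beamformer guarantees.

```python
import numpy as np


def smooth_along_frequency(w, width=3):
    """Moving average of beamforming coefficients over frequency bins.

    w:     complex array of shape (F, D), F frequency bins, D microphones.
    width: odd number of bins in the averaging window (assumed parameter,
           not taken from the paper).
    """
    if width % 2 != 1:
        raise ValueError("width must be odd")
    half = width // 2
    # Repeat the edge bins so the output keeps all F bins.
    padded = np.pad(w, ((half, half), (0, 0)), mode="edge")
    kernel = np.ones(width) / width
    # Average each microphone channel independently along frequency.
    return np.stack(
        [np.convolve(padded[:, d], kernel, mode="valid")
         for d in range(w.shape[1])],
        axis=1,
    )


# Toy input: 5 bins, 2 microphones, one isolated outlier bin.
w = np.zeros((5, 2), dtype=complex)
w[2, 0] = 3.0
w_smooth = smooth_along_frequency(w, width=3)
```

In this toy input the isolated value in bin 2 is spread over bins 1 to 3, which is the effect one would want when a single bin has an unreliable estimate because it saw little speech.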

Year	Venue	DocType
2018	Speech Communication; 13th ITG-Symposium	Conference
ISBN	Citations	PageRank
978-3-8007-4767-2	0	0.34
References	Authors
0	3

Authors (3 rows)

Name	Order	Citations	PageRank
Jens Heitkaemper	1	4	2.50
Jahn Heymann	2	102	10.29
Reinhold Haeb-Umbach	3	1487	211.71

Cited by (0 rows)

References (0 rows)
