Title
Neural Spectrospatial Filtering
Abstract
As the most widely-used spatial filtering approach for multi-channel speech separation, beamforming extracts the target speech signal arriving from a specific direction. An emerging alternative is multi-channel complex spectral mapping, which trains a deep neural network (DNN) to directly estimate the real and imaginary spectrograms of the target speech signal from those of the multi-channel noisy mixture. In this all-neural approach, the trained DNN itself becomes a nonlinear, time-varying spectrospatial filter. However, it remains unclear how this approach performs relative to commonly-used beamforming techniques across different array configurations and acoustic environments. This paper examines this issue systematically. Comprehensive evaluations show that multi-channel complex spectral mapping achieves separation performance comparable to or better than beamforming across different array geometries and speech separation tasks. It also reduces to monaural complex spectral mapping in single-channel conditions, demonstrating its general utility for both multi-channel and single-channel speech separation. In addition, this approach is computationally more efficient than widely-used mask-based beamforming. We conclude that the neural spectrospatial filter provides a strong alternative to traditional and mask-based beamforming.
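The input/output formulation described in the abstract can be illustrated with a minimal sketch: the real and imaginary STFT spectrograms of all microphone channels are stacked as input features, and a trained network maps them to the real and imaginary spectrograms of the target at a reference microphone. The array size, STFT dimensions, and the linear placeholder standing in for the trained DNN below are illustrative assumptions, not details from the paper.

```python
import numpy as np

# Illustrative shapes (assumptions, not from the paper):
# P microphones, T time frames, F frequency bins.
P, T, F = 6, 50, 257

# Multi-channel mixture STFT: one complex spectrogram per microphone.
rng = np.random.default_rng(0)
mixture_stft = rng.standard_normal((P, T, F)) + 1j * rng.standard_normal((P, T, F))

# Input features: real and imaginary parts of all channels, stacked
# along the feature axis -> shape (T, F, 2P).
features = np.concatenate([mixture_stft.real, mixture_stft.imag], axis=0)
features = np.transpose(features, (1, 2, 0))  # (T, F, 2P)

# Placeholder for the trained DNN: any function mapping (T, F, 2P) to
# (T, F, 2), i.e. the target's real and imaginary spectrograms at a
# reference microphone. A random linear map stands in for the network here.
W = rng.standard_normal((2 * P, 2)) * 0.1
target_ri = features @ W  # (T, F, 2)

# Reassemble the complex target estimate; an iSTFT would yield the waveform.
target_stft = target_ri[..., 0] + 1j * target_ri[..., 1]
```

Because the network consumes the raw real/imaginary spectrograms of every channel, it can exploit both spectral and spatial cues jointly, which is why the trained DNN acts as a spectrospatial filter rather than a fixed beamformer.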
Year
2022
DOI
10.1109/TASLP.2022.3145319
Venue
IEEE/ACM Transactions on Audio, Speech, and Language Processing
Keywords
Beamforming, deep learning, multi-channel complex spectral mapping, spectrospatial filtering, speech separation
DocType
Journal
Volume
30
Issue
1
ISSN
2329-9290
Citations
1
PageRank
0.37
References
15
Authors
3
Name            Order  Citations  PageRank
Ke Tan          1      40         9.22
Zhong-Qiu Wang  2      68         9.93
DeLiang Wang    3      49         2.71