Title
A two-stage complex network using cycle-consistent generative adversarial networks for speech enhancement
Abstract
Cycle-consistent generative adversarial networks (CycleGAN) have shown promising performance for speech enhancement (SE), but one intractable shortcoming of CycleGAN-based SE systems is that noise components propagate throughout the cycle and cannot be completely eliminated. Additionally, conventional CycleGAN-based SE systems estimate only the spectral magnitude and leave the phase unaltered. Motivated by the multi-stage learning concept, we propose a novel two-stage denoising system that combines a CycleGAN-based magnitude-enhancing network with a subsequent complex spectral refining network. Specifically, in the first stage, a CycleGAN-based model estimates only the magnitude, which is then coupled with the original noisy phase to obtain a coarsely enhanced complex spectrum. The second stage then further suppresses the residual noise components and estimates the clean phase with a complex spectral mapping network, a purely complex-valued network composed of complex 2D convolution/deconvolution and complex temporal-frequency attention blocks. Experimental results on two public datasets demonstrate that the proposed approach consistently surpasses previous one-stage CycleGANs and other state-of-the-art SE systems on various evaluation metrics, especially in background noise suppression.
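The hand-off between the two stages described in the abstract (an estimated magnitude coupled with the original noisy phase) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the random array standing in for a noisy STFT, the fixed gain standing in for the CycleGAN generator, and the function name `couple_magnitude_with_noisy_phase` are all assumptions made for illustration.

```python
import numpy as np


def couple_magnitude_with_noisy_phase(enhanced_mag, noisy_spec):
    """Combine a stage-1 magnitude estimate with the phase of the noisy spectrum."""
    noisy_phase = np.angle(noisy_spec)  # phase is taken from the noisy input, unaltered
    return enhanced_mag * np.exp(1j * noisy_phase)


# Toy complex spectrogram (frequency bins x frames) standing in for a noisy STFT.
rng = np.random.default_rng(0)
noisy_spec = rng.standard_normal((257, 100)) + 1j * rng.standard_normal((257, 100))

# Hypothetical stand-in for the CycleGAN generator: a fixed gain on the noisy magnitude.
enhanced_mag = 0.5 * np.abs(noisy_spec)

# Coarsely enhanced complex spectrum: new magnitude, original noisy phase.
coarse_spec = couple_magnitude_with_noisy_phase(enhanced_mag, noisy_spec)
```

In the described system, a complex spectrum of this kind would then be passed to the second-stage complex spectral mapping network, which refines both magnitude and phase.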
Year: 2021
DOI: 10.1016/j.specom.2021.09.001
Venue: SPEECH COMMUNICATION
Keywords: Speech enhancement, Cycle-consistent generative adversarial network, Multi-stage learning, Complex spectral mapping, Deep complex network
DocType: Journal
Volume: 134
ISSN: 0167-6393
Citations: 0
PageRank: 0.34
References: 0
Authors: 5
Name            Order  Citations  PageRank
Guochen Yu      1      2          3.09
Yutian Wang     2      0          4.06
Hui Wang        3      4          2.86
Qin Zhang       4      9          4.44
Chengshi Zheng  5      32         11.66