Title
LEARNING DISENTANGLED FEATURE REPRESENTATIONS FOR SPEECH ENHANCEMENT VIA ADVERSARIAL TRAINING
Abstract
Neural speech enhancement degrades significantly in the face of unseen noise. To address this mismatch, we propose to learn noise-agnostic feature representations through disentanglement learning, which removes the unspecified noise factor while keeping the specified factors of variation associated with clean speech. Specifically, a discriminator module, referred to as the disentangler, is introduced to distinguish noise types. Under an adversarial training strategy, a gradient reversal layer disentangles the noise factor and removes it from the feature representation. Experimental results show that the proposed approach achieves 5.8% and 5.2% relative improvements over the best baseline in terms of perceptual evaluation of speech quality (PESQ) and segmental signal-to-noise ratio (SSNR), respectively. The ablation study indicates that the proposed disentangler module is also effective in other encoder-decoder-like structures.
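The gradient reversal idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the class and function names, the linear noise-type scorer, and the squared-error loss are all assumptions chosen to keep the example dependency-free. It only shows the core mechanism, namely that the layer is an identity in the forward pass and flips (and scales) the gradient in the backward pass, so the encoder is pushed to make noise types harder to tell apart while the disentangler itself is still trained to tell them apart.

```python
# Minimal gradient reversal layer (GRL) with hand-written backward passes.
# All names here are hypothetical and not taken from the paper.

class GradientReversal:
    """Identity in the forward pass; scales gradients by -lam in the backward pass."""

    def __init__(self, lam=1.0):
        self.lam = lam

    def forward(self, x):
        return list(x)  # features pass through unchanged

    def backward(self, grad_output):
        return [-self.lam * g for g in grad_output]  # sign flip reaches the encoder


def disentangler_grads(features, weights, target, grl):
    """Squared-error loss of a linear noise-type scorer, plus its gradients.

    Returns (loss, gradient for the scorer weights, gradient sent to the encoder).
    """
    reversed_in = grl.forward(features)
    score = sum(w * f for w, f in zip(weights, reversed_in))
    err = score - target
    loss = 0.5 * err ** 2
    grad_weights = [err * f for f in reversed_in]  # the scorer minimizes the loss
    grad_features = [err * w for w in weights]     # gradient at the GRL output
    grad_encoder = grl.backward(grad_features)     # reversed before the encoder
    return loss, grad_weights, grad_encoder


grl = GradientReversal(lam=0.5)
loss, g_w, g_enc = disentangler_grads([1.0, 2.0], [0.5, -1.0], 1.0, grl)
```

In an autodiff framework the same behavior would normally be expressed as a custom differentiable function instead of manual gradients; the scaling factor `lam` here is an assumed hyperparameter that trades off the adversarial signal against the enhancement objective.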
Year
2021
DOI
10.1109/ICASSP39728.2021.9413512
Venue
2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021)
Keywords
Disentangled feature learning, adversarial training, speech enhancement
DocType
Conference
Citations
0
PageRank
0.34
References
7
Authors
4
Name              Order  Citations  PageRank
Nana Hou          1      2          2.42
Chenglin Xu       2      20         8.30
Eng Siong Chng    3      970        106.33
Haizhou Li        4      3678       334.61