Title | ||
---|---|---|
LEARNING DISENTANGLED FEATURE REPRESENTATIONS FOR SPEECH ENHANCEMENT VIA ADVERSARIAL TRAINING |
Abstract | ||
---|---|---|
Neural speech enhancement degrades significantly in the face of unseen noise. To address this mismatch, we propose to learn noise-agnostic feature representations through disentanglement learning, which removes the unspecified noise factor while keeping the specified factors of variation associated with clean speech. Specifically, a discriminator module, referred to as the disentangler, is introduced to distinguish the noise type. With an adversarial training strategy, a gradient reversal layer seeks to disentangle the noise factor and remove it from the feature representation. Experimental results show that the proposed approach achieves 5.8% and 5.2% relative improvements over the best baseline in terms of perceptual evaluation of speech quality (PESQ) and segmental signal-to-noise ratio (SSNR), respectively. The ablation study indicates that the proposed disentangler module is also effective in other encoder-decoder-like structures. |
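
The gradient reversal idea mentioned in the abstract can be illustrated with a toy scalar sketch. This is not the authors' architecture: the one-weight "encoder" and "disentangler", the binary noise-type label, and all names (`grl_forward`, `grl_backward`, `toy_gradients`, `lam`) are assumptions made only for illustration. The layer is an identity in the forward pass and multiplies the incoming gradient by `-lam` in the backward pass, so the disentangler is trained to classify the noise type while the encoder is pushed in the opposite direction.

```python
import math


def grl_forward(z):
    # Forward pass of the gradient reversal layer: identity.
    return z


def grl_backward(grad, lam=1.0):
    # Backward pass: flip the sign of the gradient and scale it by lam.
    return -lam * grad


def toy_gradients(x, y, w_enc, w_disc, lam=1.0):
    """Gradients of a binary cross-entropy noise-type loss in a toy model.

    Hypothetical setup: feature z = w_enc * x, disentangler logit = w_disc * z,
    label y in {0, 1} standing in for the noise type.
    """
    z = w_enc * x                      # encoder feature
    logit = w_disc * grl_forward(z)    # disentangler sees the feature unchanged
    p = 1.0 / (1.0 + math.exp(-logit))
    d_logit = p - y                    # derivative of BCE w.r.t. the logit
    grad_w_disc = d_logit * z          # ordinary gradient for the disentangler
    grad_z = d_logit * w_disc          # gradient arriving at the reversal layer
    grad_w_enc = grl_backward(grad_z, lam) * x  # reversed gradient for the encoder
    return grad_w_disc, grad_w_enc


g_disc, g_enc = toy_gradients(x=2.0, y=1.0, w_enc=0.5, w_disc=1.0, lam=1.0)
print(g_disc, g_enc)
```

With these inputs the two gradients have opposite signs: a descent step lowers the classification loss through the disentangler weight and raises it through the encoder weight, which is the adversarial effect the abstract describes. Setting `lam=0.0` switches the encoder's adversarial signal off.
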
Year | DOI | Venue |
---|---|---|
2021 | 10.1109/ICASSP39728.2021.9413512 | 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021) |
Keywords | DocType | Citations |
---|---|---|
Disentangled feature learning, adversarial training, speech enhancement | Conference | 0 |
PageRank | References | Authors |
---|---|---|
0.34 | 7 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Nana Hou | 1 | 2 | 2.42 |
Chenglin Xu | 2 | 20 | 8.30 |
Eng Siong Chng | 3 | 970 | 106.33 |
Haizhou Li | 4 | 3678 | 334.61 |