LEARNING DISENTANGLED FEATURE REPRESENTATIONS FOR SPEECH ENHANCEMENT VIA ADVERSARIAL TRAINING - Citegraph

Paper Info

Title
LEARNING DISENTANGLED FEATURE REPRESENTATIONS FOR SPEECH ENHANCEMENT VIA ADVERSARIAL TRAINING

Abstract
Neural speech enhancement degrades significantly in the face of unseen noise. To address this mismatch, we propose to learn noise-agnostic feature representations through disentanglement learning, which removes the unspecified noise factor while keeping the specified factors of variation associated with the clean speech. Specifically, a discriminator module, referred to as the disentangler, is introduced to distinguish the noise types. With the adversarial training strategy, a gradient reversal layer seeks to disentangle the noise factor and remove it from the feature representation. Experimental results show that the proposed approach achieves 5.8% and 5.2% relative improvements over the best baseline in terms of perceptual evaluation of speech quality (PESQ) and segmental signal-to-noise ratio (SSNR), respectively. The ablation study indicates that the proposed disentangler module is also effective in other encoder-decoder-like structures.
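
As a rough illustration of the gradient reversal idea mentioned in the abstract, the sketch below shows a layer that passes features through unchanged in the forward direction and negates (and scales) the gradient coming back from a noise-type discriminator. The class name `GradientReversal`, the scaling factor `lam`, and the toy values are hypothetical and chosen only for illustration; this record does not describe the paper's actual implementation.

```python
# Illustrative sketch only: the class name, `lam`, and the toy values are
# assumptions, not details taken from the paper.

class GradientReversal:
    """Identity in the forward pass; scales gradients by -lam in the backward pass."""

    def __init__(self, lam=1.0):
        self.lam = lam

    def forward(self, features):
        # Features reach the noise-type discriminator unchanged.
        return list(features)

    def backward(self, grad_from_discriminator):
        # The encoder receives the negated, scaled discriminator gradient, so a
        # gradient-descent step on the encoder pushes the discriminator loss up.
        return [-self.lam * g for g in grad_from_discriminator]


grl = GradientReversal(lam=0.5)
features = [0.2, -1.0, 3.0]
passed = grl.forward(features)                  # same values as `features`
reversed_grad = grl.backward([1.0, -2.0, 0.0])  # sign flipped, scaled by 0.5
```

In an adversarial setup of this kind, the discriminator is trained normally on its own loss, while the reversed gradient discourages the encoder from retaining information that helps identify the noise type.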

Year	DOI	Venue
2021	10.1109/ICASSP39728.2021.9413512	2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021)
Keywords	DocType	Citations
Disentangled feature learning, adversarial training, speech enhancement	Conference	0
PageRank	References	Authors
0.34	7	4

Authors (4 rows)

Name	Order	Citations	PageRank
Nana Hou	1	2	2.42
Chenglin Xu	2	20	8.30
Eng Siong Chng	3	970	106.33
Haizhou Li	4	3678	334.61