Title
Extraction of Noise-Robust Speaker Embedding Based on Generative Adversarial Networks
Abstract
In speaker verification, systems based on the x-vector framework are widely used in many scenarios. However, their performance degrades under noise. In this paper, we first analyze the noise robustness of x-vectors by training the network on a mixed dataset that includes both clean and corrupted data. We then propose a novel adversarial strategy against noise interference and use it to extract noise-robust speaker embeddings with the x-vector framework. The proposed adversarial method, named triple-net GAN, employs three connected networks: a generator network (G), a discriminator network (D), and a classifier network (C). The spectral coefficients of clean and noisy speech utterances are fed to G, whose structure is nearly the same as that of the x-vector network. The outputs of G are passed in parallel to D and C. The labels of D are binary, distinguishing clean data from corrupted data, while the labels of C correspond to speaker identities; the aim is to learn speaker embedding features that are invariant to noise. Finally, we conduct experiments with different variants of triple-net GAN to verify the denoising capability of the proposed adversarial method. Experimental results on the Librispeech corpus demonstrate that the proposed method achieves better performance in noisy environments.
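The data flow described in the abstract (features into G, whose output is sent in parallel to D and C) can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the layer sizes, the single frame-level layer, the helper names (`init`, `generator`, `triple_net_losses`), and above all the combined generator objective `loss_c - weight * loss_d` are assumptions made for illustration, since the abstract does not state the loss functions. The mean-and-standard-deviation pooling follows the general x-vector design.

```python
import math
import random

def linear(x, w, b):
    """Affine map; w is a list of rows (out_dim x in_dim), b a bias vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bj for row, bj in zip(w, b)]

def relu(v):
    return [max(0.0, a) for a in v]

def softmax(v):
    m = max(v)
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]

def cross_entropy(logits, label):
    return -math.log(softmax(logits)[label])

def init(out_dim, in_dim, rng):
    """Small random weights and zero biases (hypothetical initialisation)."""
    w = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    return w, [0.0] * out_dim

def stats_pool(hidden):
    """Concatenate the per-dimension mean and standard deviation over time."""
    n = len(hidden)
    means = [sum(col) / n for col in zip(*hidden)]
    stds = [math.sqrt(sum((a - m) ** 2 for a in col) / n)
            for col, m in zip(zip(*hidden), means)]
    return means + stds

def generator(frames, params):
    """G: frame-level layer, statistics pooling, then an embedding layer."""
    (w1, b1), (w2, b2) = params
    hidden = [relu(linear(f, w1, b1)) for f in frames]
    return linear(stats_pool(hidden), w2, b2)

def triple_net_losses(frames, speaker_id, is_noisy, g, d, c, weight=1.0):
    emb = generator(frames, g)                               # shared embedding
    loss_c = cross_entropy(linear(emb, *c), speaker_id)      # C: speaker identity
    loss_d = cross_entropy(linear(emb, *d), int(is_noisy))   # D: clean vs. noisy
    # Assumed objective for G: stay discriminative for speakers while
    # making the clean/noisy decision harder for D.
    loss_g = loss_c - weight * loss_d
    return emb, loss_c, loss_d, loss_g

# Toy usage with random features: 10 frames of 4 coefficients, 3 speakers.
rng = random.Random(0)
feat_dim, hid_dim, emb_dim, n_speakers = 4, 8, 6, 3
g = (init(hid_dim, feat_dim, rng), init(emb_dim, 2 * hid_dim, rng))
d = init(2, emb_dim, rng)
c = init(n_speakers, emb_dim, rng)
frames = [[rng.gauss(0.0, 1.0) for _ in range(feat_dim)] for _ in range(10)]
emb, loss_c, loss_d, loss_g = triple_net_losses(frames, 1, True, g, d, c)
```

In an actual training loop, D would be updated to minimise `loss_d`, C to minimise `loss_c`, and G to minimise an objective of the kind sketched in `loss_g`, so that the embedding keeps speaker information while discarding cues that separate clean from noisy input.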
Year
2019
DOI
10.1109/APSIPAASC47483.2019.9023295
Venue
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference
Keywords
noise-robust, generative adversarial networks, speaker embedding, speaker verification
DocType
Conference
ISSN
2309-9402
Citations
0
PageRank
0.34
References
0
Authors
4
Name             Order   Citations   PageRank
Jianfeng Zhou    1       5           1.11
Tao Jiang        2       87          19.51
Q. Y. Hong       3       50          15.79
Lin Li           4       323         79.92