Title
Contrastive Learning for improving End-to-end Speaker Verification
Abstract
Speaker verification examines a speech signal to accept or reject a speaker's identity claim. Deep neural networks are among the most successful complex non-linear models for learning unique and invariant features of data. They have been employed in speech recognition tasks and have shown potential for speaker recognition as well. However, overfitting still limits model performance. In this study, we apply contrastive learning to speaker verification tasks to address the robustness problem. In addition, we introduce a domain adaptive loss for these tasks. Experimental results and an ablation study indicate that our proposed model significantly outperforms various end-to-end baselines, including d-vector approaches, deep-speaker, and the generalized end-to-end model, by at least 10% relative, for text-dependent speaker verification on a company's internal text-dependent voice command dataset.
Year: 2021
DOI: 10.1109/IJCNN52387.2021.9533489
Venue: 2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN)
Keywords: End-to-end speaker verification, Contrastive learning, Domain adaptive loss, text-dependent
DocType: Conference
ISSN: 2161-4393
Citations: 0
PageRank: 0.34
References: 0
Authors
4
Name
Order
Citations
PageRank
Yanxi Tang100.34
Jianzong Wang26134.65
Xiaoyang Qu301.35
Jing Xiao475.78