Contrastive Learning for improving End-to-end Speaker Verification - Citegraph

Paper Info

Title
Contrastive Learning for improving End-to-end Speaker Verification

Abstract
Speaker verification examines a speech signal to accept or reject a speaker's identity claim. Deep neural networks are a successful class of complex non-linear models for learning distinctive and invariant features from data. They have been employed in speech recognition tasks and have shown potential for speaker recognition as well. However, overfitting still limits model performance. In this study, we apply contrastive learning to speaker verification tasks to address the robustness problem. In addition, we introduce a domain adaptive loss for these tasks. Experimental results and an ablation study indicate that our proposed model significantly outperforms various baseline end-to-end methods, including d-vector approaches, Deep Speaker, and the generalized end-to-end model, by at least 10% relative, for text-dependent speaker verification on a company's internal text-dependent voice command dataset.

Year	DOI	Venue
2021	10.1109/IJCNN52387.2021.9533489	2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN)
Keywords	DocType	ISSN
End-to-end speaker verification, Contrastive learning, Domain adaptive loss, text-dependent	Conference	2161-4393
Citations	PageRank	References
0	0.34	0
Authors
4

Authors (4 rows)

Name	Order	Citations	PageRank
Yanxi Tang	1	0	0.34
Jianzong Wang	2	61	34.65
Xiaoyang Qu	3	0	1.35
Jing Xiao	4	7	5.78