Abstract |
---|
We propose an end-to-end speaker verification system based on a neural network and trained with a loss function of lower computational complexity. The system uses a ResNet architecture to extract features from each utterance, produces utterance-level speaker embeddings, and is trained with the large-margin Gaussian Mixture loss function. Through its large margin and likelihood regularization, this loss function improves speaker verification performance. Experimental results demonstrate that the residual CNN with large-margin Gaussian Mixture loss outperforms a DNN-based i-vector baseline by more than 10% in accuracy. |
Year | Venue | Keywords |
---|---|---|
2018 | 2018 IEEE 23rd International Conference on Digital Signal Processing (DSP) | Feature extraction, Training, Task analysis, Neural networks, Hidden Markov models, Loss measurement, Adaptation models |
DocType | Volume | Citations |
Conference | abs/1805.00645 | 0 |
PageRank | References | Authors |
0.34 | 0 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Xuan Shi | 1 | 29 | 6.72 |
Mengyao Zhu | 2 | 0 | 1.35 |
Xingjian Du | 3 | 1 | 3.39 |
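
The large-margin Gaussian Mixture loss named in the abstract combines a margin-based classification term with a likelihood regularization term. The following is a minimal sketch for a single embedding, not the authors' implementation. It assumes identity covariance for every speaker class and equal class priors, which the abstract does not specify. The function name `lgm_loss` and the default values of `alpha` (margin) and `lam` (regularization weight) are hypothetical choices for illustration.

```python
import math


def lgm_loss(embedding, label, means, alpha=0.01, lam=0.1):
    """Large-margin Gaussian Mixture loss for one embedding (illustrative sketch).

    Assumptions not taken from the source: identity covariance per class,
    equal class priors, and hypothetical defaults for alpha and lam.
    """
    # Half squared Euclidean distance from the embedding to each class mean.
    dists = [
        0.5 * sum((x - m) ** 2 for x, m in zip(embedding, mu))
        for mu in means
    ]
    # Scale the true-class distance by (1 + alpha) to enforce a margin.
    logits = [
        -(d * (1.0 + alpha) if k == label else d)
        for k, d in enumerate(dists)
    ]
    # Cross-entropy over the margin-adjusted logits, via a stable log-sum-exp.
    top = max(logits)
    log_norm = top + math.log(sum(math.exp(v - top) for v in logits))
    classification = log_norm - logits[label]
    # Likelihood regularization: pulls the embedding toward its class mean.
    # The log-determinant term vanishes under the identity-covariance assumption.
    likelihood = dists[label]
    return classification + lam * likelihood
```

In this sketch, a larger `alpha` demands that an embedding sit closer to its own class mean than to any other before the classification term becomes small, while `lam` controls how strongly embeddings are pulled toward their class mean.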