Abstract
---
Deep speaker embedding learning is an effective method for speaker identity modelling. Very deep models such as ResNet can achieve remarkable results but are usually too computationally expensive for real applications with limited resources. On the other hand, simply reducing model size is likely to result in significant performance degradation. In this paper, label-level and embedding-level knowledge distillation are proposed to narrow down the performance gap between large and small models. Label-level distillation utilizes the posteriors obtained by a well-trained teacher model to guide the optimization of the student model, while embedding-level distillation directly constrains the similarity between embeddings learned by the two models. Experiments were carried out on the VoxCeleb1 dataset. Results show that the proposed knowledge distillation methods can significantly boost the performance of highly compact student models.
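The two distillation objectives described in the abstract can be sketched as loss functions: a label-level loss that matches the student's posteriors to the teacher's (here via KL divergence with a softening temperature), and an embedding-level loss that directly constrains embedding similarity (here via cosine distance). This is a minimal pure-Python illustration; the function names, the temperature `T`, and the specific choice of KL divergence and cosine distance are assumptions for exposition, not necessarily the paper's exact formulation.

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; T > 1 softens the teacher's posteriors."""
    m = max(x / T for x in logits)  # subtract max for numerical stability
    exps = [math.exp(x / T - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def label_level_loss(teacher_logits, student_logits, T=2.0):
    """Label-level KD: KL divergence from teacher to student posteriors."""
    p = softmax(teacher_logits, T)  # teacher posteriors (soft labels)
    q = softmax(student_logits, T)  # student posteriors
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def embedding_level_loss(teacher_emb, student_emb):
    """Embedding-level KD: cosine distance between the two embeddings."""
    dot = sum(a * b for a, b in zip(teacher_emb, student_emb))
    nt = math.sqrt(sum(a * a for a in teacher_emb))
    ns = math.sqrt(sum(b * b for b in student_emb))
    return 1.0 - dot / (nt * ns)
```

In practice these terms would be combined with the usual classification loss of the student network and minimized jointly; the relative weighting of the terms is a tunable hyperparameter.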
Year | DOI | Venue
---|---|---
2019 | 10.1109/icassp.2019.8683443 | 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Keywords | Field | DocType
---|---|---
knowledge distillation, teacher-student learning, speaker verification, speaker embedding | Speaker verification, Embedding, Small foot, Pattern recognition, Effective method, Computer science, Distillation, Artificial intelligence, Residual neural network, Performance gap, Machine learning | Conference
ISSN | Citations | PageRank
---|---|---
1520-6149 | 0 | 0.34
References | Authors
---|---
0 | 5
Name | Order | Citations | PageRank
---|---|---|---
Shuai Wang | 1 | 4 | 1.85 |
Yexin Yang | 2 | 1 | 2.04 |
Tianzhe Wang | 3 | 10 | 1.79 |
Yanmin Qian | 4 | 295 | 44.44 |
Kai Yu | 5 | 1082 | 90.58 |