Abstract |
---|
To improve far-field speech recognition performance, this paper proposes distilling knowledge from a close-talking model to a far-field model using parallel data. The close-talking model serves as the teacher and the far-field model as the student. The student is trained to imitate the teacher's output distributions, a constraint realized by minimizing the Kullback-Leibler (KL) divergence between the output distributions of the student and teacher models. Experimental results on the AMI corpus show that the best student model achieves up to a 4.7% absolute word error rate (WER) reduction compared with conventionally trained baseline models. |
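The distillation objective described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names (`kl_distillation_loss`, `softmax`) and the `temperature` parameter are assumptions for the example; the paper itself only specifies minimizing the KL divergence between the student's and teacher's output distributions on parallel data.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax over the last axis,
    # shifted by the max for numerical stability.
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """KL(teacher || student), averaged over frames.

    The student (far-field) model is trained to match the output
    distribution of the teacher (close-talking) model, computed on
    time-aligned parallel data.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    eps = 1e-12  # guard against log(0)
    kl = np.sum(p_teacher * (np.log(p_teacher + eps) - np.log(p_student + eps)),
                axis=-1)
    return kl.mean()
```

When the student's outputs match the teacher's exactly, the loss is zero; any mismatch yields a positive penalty, which is what drives the student toward the teacher's distributions during training.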
Year | Venue | Field
---|---|---
2018 | arXiv: Computation and Language | Computer science, Word error rate, Near and far field, Speech recognition, Natural language processing, Artificial intelligence

DocType | Volume | Citations
---|---|---
Journal | abs/1802.06941 | 0

PageRank | References | Authors
---|---|---
0.34 | 14 | 4
Name | Order | Citations | PageRank |
---|---|---|---
Jiangyan Yi | 1 | 19 | 17.99 |
Jianhua Tao | 2 | 848 | 138.00 |
Zhengqi Wen | 3 | 86 | 24.41 |
Bin Liu | 4 | 5 | 2.45 |