Title
Mutual-learning sequence-level knowledge distillation for automatic speech recognition
Abstract
Automatic speech recognition (ASR) is a crucial technology for human-machine interaction. End-to-end deep learning models for ASR have been studied extensively in recent years. However, their large model sizes and computation costs make them unsuitable for practical deployment. To address this issue, we propose a novel mutual-learning sequence-level knowledge distillation framework with distinct student structures for ASR. Trained mutually and simultaneously, each student learns not only from the pre-trained teacher but also from its distinct peers, which improves the generalization capability of the whole network by compensating for the insufficiency of each individual student and bridging the gap between each student and the teacher. Extensive experiments on the TIMIT and large LibriSpeech corpora show that, compared with state-of-the-art methods, the proposed method achieves an excellent balance between recognition accuracy and model compression.
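The abstract describes students that are jointly optimized against a ground-truth CTC objective, a distillation term toward a frozen teacher, and a mutual-learning term toward each peer. The following is a minimal PyTorch sketch of such a combined loss; the function name mutual_kd_loss, the weights alpha and beta, and the use of frame-level KL divergence (rather than the paper's sequence-level distillation) are illustrative assumptions, not the authors' exact formulation.

import torch
import torch.nn.functional as F

def mutual_kd_loss(student_log_probs, teacher_log_probs, peer_log_probs,
                   targets, input_lengths, target_lengths,
                   alpha=0.5, beta=0.5):
    """Illustrative combined loss: CTC on ground truth + KL toward a
    frozen teacher + KL toward a peer student (mutual learning).
    All log-prob tensors are (T, N, C); this frame-level KL is only a
    stand-in for the paper's sequence-level distillation."""
    # Supervised CTC loss on the ground-truth transcripts.
    ctc = F.ctc_loss(student_log_probs, targets,
                     input_lengths, target_lengths, blank=0)
    # Distillation toward the pre-trained teacher's output distribution.
    kd_teacher = F.kl_div(student_log_probs, teacher_log_probs.detach(),
                          reduction='batchmean', log_target=True)
    # Mutual-learning term: match the peer student's distribution.
    kd_peer = F.kl_div(student_log_probs, peer_log_probs.detach(),
                       reduction='batchmean', log_target=True)
    return ctc + alpha * kd_teacher + beta * kd_peer

Each student would be trained with its own instance of this loss, with the other students serving as peers, so that all students improve simultaneously rather than sequentially.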
Year
2021
DOI
10.1016/j.neucom.2020.11.025
Venue
Neurocomputing
Keywords
Automatic speech recognition (ASR), Model compression, Knowledge distillation (KD), Mutual learning, Connectionist temporal classification (CTC)
DocType
Journal
Volume
428
ISSN
0925-2312
Citations
0
PageRank
0.34
References
0
Authors
4
Name           Order  Citations  PageRank
Zerui Li       1      0          0.34
Yue Ming       2      45         8.83
Dongkai Yang   3      40         22.91
Jing-Hao Xue   4      15         10.05