Abstract
---
Increasing demand for on-device Automatic Speech Recognition (ASR) systems has resulted in renewed interest in developing automatic model compression techniques. Past research has shown that an AutoML-based Low-Rank Factorization (LRF) technique, when applied to an end-to-end Encoder-Attention-Decoder style ASR model, can achieve a speedup of up to 3.7x, outperforming laborious manual rank-selection approaches. However, we show that current AutoML-based search techniques only work up to a certain compression level, beyond which they fail to produce compressed models with acceptable word error rates (WER). In this work, we propose an iterative AutoML-based LRF approach that achieves over 5x compression without degrading the WER, thereby advancing the state-of-the-art in ASR compression.
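To make the LRF idea concrete, below is a minimal sketch of compressing a single weight matrix by truncated SVD, with a greedy rank selection driven by reconstruction error. This is an illustration of the general technique only: the function names (`lrf`, `pick_rank`), the error budget, and the random stand-in matrix are assumptions of this sketch, and the reconstruction-error proxy stands in for the paper's actual AutoML rank search and WER-based evaluation, which are not reproduced here.

```python
import numpy as np

def lrf(W: np.ndarray, rank: int):
    """Factorize W (m x n) as A @ B, with A (m x rank) and B (rank x n)."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]  # absorb singular values into the left factor
    B = Vt[:rank, :]
    return A, B

def pick_rank(W: np.ndarray, max_rel_error: float) -> int:
    """Smallest rank whose relative Frobenius reconstruction error
    stays within max_rel_error (a crude stand-in for a WER constraint)."""
    s = np.linalg.svd(W, compute_uv=False)
    # tail[r] = relative error incurred by keeping only the first r singular values
    tail = np.sqrt(np.cumsum((s ** 2)[::-1])[::-1] / np.sum(s ** 2))
    ok = np.nonzero(tail <= max_rel_error)[0]
    return int(ok[0]) if ok.size else len(s)

# Hypothetical usage on a stand-in for one encoder weight matrix.
W = np.random.randn(640, 320)
r = pick_rank(W, max_rel_error=0.3)
A, B = lrf(W, r)
print(f"rank {r}: {W.size / (A.size + B.size):.2f}x fewer parameters in this layer")
```

Replacing a dense layer `W x` with the pair `A (B x)` cuts its parameter and multiply count from `m*n` to `rank*(m+n)`, which is where the compression comes from; per the abstract, the paper's contribution is choosing those ranks iteratively with AutoML rather than by a fixed error budget as above.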
Year | DOI | Venue
---|---|---
2020 | 10.21437/Interspeech.2020-1894 | INTERSPEECH

DocType | Citations | PageRank
---|---|---
Conference | 0 | 0.34

References | Authors
---|---
0 | 12
Name | Order | Citations | PageRank |
---|---|---|---
Abhinav Mehrotra | 1 | 0 | 1.01 |
Łukasz Dudziak | 2 | 17 | 4.37 |
Jinsu Yeo | 3 | 0 | 0.34 |
Young-yoon Lee | 4 | 0 | 0.34 |
Ravichander Vipperla | 5 | 8 | 2.20 |
Mohamed S. Abdelfattah | 6 | 144 | 13.65 |
Sourav Bhattacharya | 7 | 624 | 52.45 |
Samin Ishtiaq | 8 | 0 | 1.69 |
Alberto Gil C. P. Ramos | 9 | 0 | 0.68 |
SangJeong Lee | 10 | 0 | 0.34 |
Daehyun Kim | 11 | 0 | 0.34 |
Nicholas D. Lane | 12 | 4247 | 248.15 |