Title
RTMobile: Beyond Real-Time Mobile Acceleration of RNNs for Speech Recognition
Abstract
Automatic speech recognition based on recurrent neural networks (RNNs) has become promising and important on mobile devices such as smartphones. However, previous RNN compression techniques suffer either from hardware performance overhead due to irregularity or from significant accuracy loss due to the regularity they preserve for hardware friendliness. In this work, we propose RTMobile, which leverages both a novel block-based pruning approach and compiler optimizations to accelerate RNN inference on mobile devices. RTMobile is the first work to achieve real-time RNN inference on mobile platforms. Experimental results demonstrate that RTMobile significantly outperforms existing RNN hardware acceleration methods in both inference accuracy and inference time. Compared with prior work on FPGA, RTMobile running a GRU on the Adreno 640 embedded GPU improves energy efficiency by 40× while maintaining the same inference time.
Year
2020
DOI
10.1109/DAC18072.2020.9218499
Venue
2020 57th ACM/IEEE Design Automation Conference (DAC)
Keywords
RNN, pruning, real-time acceleration, mobile
DocType
Conference
ISSN
0738-100X
ISBN
978-1-7281-1085-1
Citations
2
PageRank
0.36
References
0
Authors
10
Name | Order | Citations | PageRank
Dong Peiyan | 1 | 4 | 3.12
Siyue Wang | 2 | 21 | 3.78
Wei Niu | 3 | 24 | 11.21
Chengming Zhang | 4 | 5 | 3.10
Sheng Lin | 5 | 139 | 14.39
Zhengang Li | 6 | 15 | 7.27
Yifan Gong | 7 | 1332 | 135.58
Bin Ren | 8 | 82 | 18.03
Xue Lin | 9 | 86 | 14.97
Dingwen Tao | 10 | 129 | 17.66