Title
Automatic Joint Optimization of Algorithm-Level Compression and Compiler-Based Acceleration with Reinforcement Learning for DNN in Edge Devices
Abstract
More accurate machine learning models often require more memory and greater software-hardware co-adaptation effort to deploy on resource-constrained devices. Model compression techniques and deep learning compilers have been developed to reduce memory cost and latency, but current methods still require tremendous manual engineering effort to optimize a model. This paper introduces a joint learning-based framework that performs the compression task and the acceleration task simultaneously: it auto-tunes algorithm-level compression and compiler-based acceleration with reinforcement learning. Experimental results demonstrate that the framework compresses the model by a factor of 2 or 8 and accelerates the optimization process by up to 30 times.
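The abstract describes auto-tuning a joint search space: an algorithm-level compression knob (e.g. quantization bit-width) and a compiler-level knob (e.g. a loop tile size) optimized together by reinforcement learning. The following is a minimal sketch of that idea, not the paper's implementation: an epsilon-greedy bandit over the joint (bit-width, tile-size) action space, with a synthetic reward standing in for accuracy/latency measured on an edge device. The candidate knob values and the reward function are illustrative assumptions.

```python
# Minimal sketch (NOT the paper's implementation) of jointly tuning an
# algorithm-level compression knob and a compiler-level knob with a
# simple epsilon-greedy bandit over the joint action space.
import random

BITS = [2, 4, 8]          # candidate quantization bit-widths (assumed knob)
TILES = [8, 16, 32, 64]   # candidate compiler tile sizes (assumed knob)

def reward(bits, tile):
    """Synthetic stand-in reward. A real framework would deploy the
    compressed, compiled model and measure accuracy and latency on the
    target device."""
    accuracy = 1.0 - 0.5 / bits                    # more bits -> higher accuracy
    latency = bits * 0.01 + abs(tile - 32) * 0.001 # toy latency model
    return accuracy - latency

def search(steps=500, eps=0.2, seed=0):
    """Epsilon-greedy search over the joint (bits, tile) space."""
    rng = random.Random(seed)
    q, n = {}, {}   # running mean reward and visit count per action
    best = None
    for _ in range(steps):
        if best is None or rng.random() < eps:
            action = (rng.choice(BITS), rng.choice(TILES))  # explore
        else:
            action = best                                   # exploit
        r = reward(*action)
        n[action] = n.get(action, 0) + 1
        q[action] = q.get(action, 0.0) + (r - q.get(action, 0.0)) / n[action]
        best = max(q, key=q.get)
    return best, q[best]

if __name__ == "__main__":
    (bits, tile), r = search()
    print(f"best config: {bits}-bit, tile={tile}, reward={r:.4f}")
```

In the paper's setting the two knobs interact (a smaller quantized model changes which compiler schedule is fastest), which is why the joint search is tuned together rather than compressing first and compiling second.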
Year
2021
DOI
10.1109/IJCNN52387.2021.9533729
Venue
2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN)
Keywords
model compression, inference acceleration, quantization, reinforcement learning
DocType
Conference
ISSN
2161-4393
Citations
0
PageRank
0.34
References
0
Authors
6
Name            Order  Citations  PageRank
Jie Liu         1      0          0.68
Jianzong Wang   2      61         34.65
Xiaoyang Qu     3      0          1.35
Bo Chen         4      0          0.68
Zihang Wei      5      0          0.68
Jing Xiao       6      7          5.78