Title
NPAS: A Compiler-aware Framework of Unified Network Pruning and Architecture Search for Beyond Real-Time Mobile Acceleration
Abstract
With the increasing demand to efficiently deploy DNNs on mobile edge devices, it becomes much more important to reduce unnecessary computation and increase execution speed. Prior methods towards this goal, including model compression and network architecture search (NAS), are largely performed independently and do not fully consider compiler-level optimizations, which are essential for mobile acceleration. In this work, we first propose (i) a general category of fine-grained structured pruning applicable to various DNN layers, and (ii) a comprehensive compiler-based automatic code generation framework supporting different DNNs and different pruning schemes, which together bridge the gap between model compression and NAS. We further propose NPAS, a compiler-aware framework for unified network pruning and architecture search. To deal with the large search space, we propose a meta-modeling procedure based on reinforcement learning with fast evaluation and Bayesian optimization, keeping the total number of training epochs comparable to that of representative NAS frameworks. Our framework achieves 6.7ms, 5.9ms, and 3.9ms ImageNet inference times with 78.2%, 75% (MobileNet-V3 level), and 71% (MobileNet-V2 level) Top-1 accuracy, respectively, on an off-the-shelf mobile phone, consistently outperforming prior work.
Year
2021
DOI
10.1109/CVPR46437.2021.01403
Venue
2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021
DocType
Conference
ISSN
1063-6919
Citations
1
PageRank
0.35
References
0
Authors
16
Name | Order | Citations | PageRank
Zhengang Li | 1 | 15 | 7.27
Geng Yuan | 2 | 7 | 3.56
Wei Niu | 3 | 24 | 11.21
Pu Zhao | 4 | 32 | 11.73
Yanyu Li | 5 | 3 | 0.84
Yuxuan Cai | 6 | 1 | 1.37
Xuan Shen | 7 | 1 | 1.70
Zhan Zheng | 8 | 5 | 4.59
Zhenglun Kong | 9 | 4 | 2.77
Qing Jin | 10 | 2 | 2.11
Zhiyu Chen | 11 | 8 | 1.59
Sijia Liu | 12 | 181 | 42.37
Kuiyuan Yang | 13 | 148 | 20.89
Bin Ren | 14 | 82 | 18.03
Yanzhi Wang | 15 | 1082 | 136.11
Xue Lin | 16 | 1 | 0.35