Abstract
---
We present APQ, a novel design methodology for efficient deep learning deployment. Unlike previous methods that separately optimize the neural network architecture, pruning policy, and quantization policy, we optimize them jointly. To handle the larger design space this brings, we train a quantization-aware accuracy predictor that is fed to an evolutionary search to select the best fit. Since directly training such a predictor requires time-consuming collection of quantized-model data, we propose a predictor-transfer technique to obtain the quantization-aware predictor: we first generate a large dataset of (NN architecture, ImageNet accuracy) pairs by sampling a pretrained unified once-for-all network and evaluating the samples directly; we then use these data to train an accuracy predictor without quantization, and transfer its weights to train the quantization-aware predictor, which largely reduces the quantization data collection time. Extensive experiments on ImageNet show the benefits of this joint design methodology: the model searched by our method maintains the same level of accuracy as the 8-bit ResNet34 model while saving 8x BitOps; we achieve 2x/1.3x latency/energy savings compared to MobileNetV2+HAQ [30, 36] while obtaining the same level of accuracy; with a low marginal search cost per new deployment scenario, our joint optimization outperforms separate optimization using ProxylessNAS+AMC+HAQ [5, 12, 36] by 2.3% accuracy while reducing GPU hours and CO2 emission by orders of magnitude relative to the training cost.
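The predictor-transfer step described above can be illustrated with a minimal sketch: train a full-precision accuracy predictor on cheap (architecture, accuracy) pairs, then reuse its weights to initialize a quantization-aware predictor that additionally consumes a bitwidth-policy encoding. The MLP structure, tensor dimensions, and the way the quantization encoding is merged in are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of the predictor-transfer idea from the abstract.
# Layer sizes and encoding dimensions are assumptions for illustration only.

class AccuracyPredictor(nn.Module):
    """Predicts accuracy from an architecture encoding (no quantization)."""
    def __init__(self, arch_dim=128, hidden=400):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(arch_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.head = nn.Linear(hidden, 1)

    def forward(self, arch_enc):
        return self.head(self.backbone(arch_enc))

class QuantAwarePredictor(nn.Module):
    """Quantization-aware predictor: reuses the full-precision predictor's
    weights and adds a new branch for the quantization (bitwidth) policy."""
    def __init__(self, pretrained: AccuracyPredictor, arch_dim=128, quant_dim=64):
        super().__init__()
        self.backbone = pretrained.backbone              # transferred weights
        self.head = pretrained.head                      # transferred weights
        self.quant_embed = nn.Linear(quant_dim, arch_dim)  # new, trained from scratch

    def forward(self, arch_enc, quant_enc):
        return self.head(self.backbone(arch_enc + self.quant_embed(quant_enc)))

if __name__ == "__main__":
    fp_predictor = AccuracyPredictor()
    # ... train fp_predictor on many cheap (architecture encoding, accuracy) pairs ...
    qa_predictor = QuantAwarePredictor(fp_predictor)
    # ... fine-tune qa_predictor on a much smaller set of quantized samples ...
    acc = qa_predictor(torch.randn(1, 128), torch.randn(1, 64))
```

Because the transferred backbone already captures how architecture choices affect accuracy, only the quantization branch needs to be learned from quantized-model data, which is what lets the quantization data collection stay small.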
Year | DOI | Venue
---|---|---|
2020 | 10.1109/CVPR42600.2020.00215 | 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

DocType | ISSN | Citations
---|---|---|
Conference | 1063-6919 | 4

PageRank | References | Authors
---|---|---|
0.39 | 27 | 8
Name | Order | Citations | PageRank |
---|---|---|---|
Tianzhe Wang | 1 | 10 | 1.79 |
Kuan Wang | 2 | 45 | 3.06 |
Han Cai | 3 | 223 | 10.39 |
Ji Lin | 4 | 79 | 8.18
Zhijian Liu | 5 | 59 | 9.80 |
Hanrui Wang | 6 | 36 | 5.63 |
Yujun Lin | 7 | 101 | 7.03 |
Song Han | 8 | 2102 | 79.81 |