Title |
---|
Feature Map Alignment - Towards Efficient Design of Mixed-precision Quantization Scheme |
Abstract |
---|
Quantization is known as an effective compression method for deploying neural networks on mobile devices. However, most existing works train a quantized network from scratch with a uniform bitwidth for all layers, making it hard to find the optimal trade-off between compression ratio and inference accuracy. In this paper, we propose a novel post-training quantization approach that derives a flexible bitwidth scheme. Our algorithm progressively downgrades the bitwidth of a chosen layer in the network and performs feature map alignment with the pre-trained model. The algorithm comprises a layer-sensitivity meter and an iterative quantizer. Specifically, for every layer the meter dynamically estimates the quantization error on its output feature map, and this error in turn serves as the objective function minimized by the quantizer. Extensive experiments on the CIFAR-10 and ImageNet ILSVRC2012 datasets demonstrate that the proposed approach achieves impressive results for mainstream neural networks. |
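The abstract's loop — measure each layer's quantization error on its output feature map, then downgrade the bitwidth of the least-sensitive layer — can be sketched as below. This is a minimal illustration, not the authors' implementation: the toy two-layer network, the symmetric uniform quantizer, the `budget` stopping rule, and all names (`quantize`, `feature_map_error`) are assumptions for the sketch.

```python
# Hypothetical sketch of mixed-precision bitwidth selection via
# feature map alignment: greedily remove one bit at a time from the
# layer whose output feature map is least affected.
import numpy as np

rng = np.random.default_rng(0)

def quantize(w, bits):
    """Symmetric uniform quantization of a weight tensor to `bits` bits."""
    scale = np.max(np.abs(w)) / (2 ** (bits - 1) - 1)
    return np.round(w / scale) * scale

# Toy two-layer "network": each feature map is just a matrix product.
weights = [rng.standard_normal((16, 16)) for _ in range(2)]
x = rng.standard_normal((8, 16))
bitwidths = [8, 8]                      # start from a high uniform bitwidth

def feature_map_error(layer, bits):
    """Meter: MSE between full-precision and quantized feature maps."""
    full = x @ weights[layer]
    quant = x @ quantize(weights[layer], bits)
    return np.mean((full - quant) ** 2)

budget = 3                              # number of downgrade steps (assumed)
for _ in range(budget):
    # Estimate each layer's sensitivity to losing one more bit.
    errors = [feature_map_error(l, bitwidths[l] - 1) for l in range(2)]
    victim = int(np.argmin(errors))     # least-sensitive layer
    bitwidths[victim] -= 1              # downgrade its bitwidth

print(bitwidths)  # resulting mixed-precision scheme, one bitwidth per layer
```

In the paper the quantizer also *minimizes* this feature-map error when choosing quantized weights, rather than simply rounding as above; the greedy downgrade loop is the part this sketch illustrates.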
Year | DOI | Venue |
---|---|---|
2019 | 10.1109/VCIP47243.2019.8965724 | VCIP |
Field | DocType | Citations
---|---|---|
Computer vision,Mixed precision,Scratch,Inference,Computer science,Algorithm,Compression ratio,Mobile device,Quantization (physics),Artificial intelligence,Artificial neural network,Quantization (signal processing) | Conference | 0
PageRank | References | Authors
---|---|---|
0.34 | 0 | 3
Name | Order | Citations | PageRank |
---|---|---|---|
Yukun Bao | 1 | 0 | 0.34 |
Yuhui Xu | 2 | 12 | 5.00 |
Hongkai Xiong | 3 | 512 | 82.84 |