Title
Feature Map Alignment - Towards Efficient Design of Mixed-precision Quantization Scheme
Abstract
Quantization is known as an effective compression method for deploying neural networks on mobile devices. However, most existing works train a quantized network from scratch with a uniform bitwidth for all layers, making it hard to find the optimal trade-off between compression ratio and inference accuracy. In this paper, we propose a novel post-training quantization approach that derives a flexible bitwidth scheme. Our algorithm progressively downgrades the bitwidth of a chosen layer in the network and performs feature map alignment with the pre-trained model. The algorithm comprises a layer-sensitivity meter and an iterative quantizer. Specifically, the meter dynamically estimates, for every layer, the quantization error on its output feature map, and this error in turn serves as the objective function minimized by the quantizer. Extensive experiments on the CIFAR-10 and ImageNet ILSVRC2012 datasets demonstrate that the proposed approach achieves impressive results for mainstream neural networks.
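The abstract describes the pipeline only at a high level. Below is a minimal sketch of the general idea, assuming a symmetric uniform weight quantizer, an MSE feature-map error as the sensitivity measure, a random stand-in calibration batch, and a simple greedy bitwidth-downgrading loop; the function names (quantize_uniform, feature_map_error), the error budget, and the toy network are illustrative assumptions, not the paper's actual iterative quantizer.

```python
# Sketch (not the authors' code): greedy mixed-precision bitwidth selection
# driven by per-layer feature-map error against the full-precision model.
import copy
import torch
import torch.nn as nn

def quantize_uniform(w: torch.Tensor, bits: int) -> torch.Tensor:
    """Symmetric uniform quantization of a weight tensor to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max() / qmax
    return torch.round(w / scale).clamp(-qmax, qmax) * scale

def feature_map_error(model: nn.Module, layer_name: str, bits: int,
                      x: torch.Tensor) -> float:
    """Sensitivity meter: MSE between the output feature maps of the
    full-precision model and a copy whose chosen layer is quantized."""
    q_model = copy.deepcopy(model)
    layer = dict(q_model.named_modules())[layer_name]
    with torch.no_grad():
        layer.weight.copy_(quantize_uniform(layer.weight, bits))
        return nn.functional.mse_loss(q_model(x), model(x)).item()

model = nn.Sequential(  # toy stand-in for a pre-trained network
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
).eval()
calib = torch.randn(8, 3, 32, 32)   # stand-in calibration batch
bitwidth = {"0": 8, "2": 8}         # per-layer bitwidths (conv layers only)
budget = 1e-3                       # tolerated feature-map MSE (assumed)

# Progressively downgrade the bitwidth of the least sensitive layer
# as long as its feature-map error stays within the budget.
with torch.no_grad():
    for _ in range(8):
        errs = {name: feature_map_error(model, name, bits - 1, calib)
                for name, bits in bitwidth.items() if bits > 2}
        if not errs:
            break
        name, err = min(errs.items(), key=lambda kv: kv[1])
        if err > budget:
            break
        bitwidth[name] -= 1
print(bitwidth)
```

With the random toy weights the loop may stop early or push several layers well below 8 bits; the sketch is only meant to show the meter/quantizer interplay, in which the error on the output feature maps both ranks layer sensitivity and gates each bitwidth downgrade.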
Year
2019
DOI
10.1109/VCIP47243.2019.8965724
Venue
VCIP
Field
Computer vision, Mixed precision, Scratch, Inference, Computer science, Algorithm, Compression ratio, Mobile device, Quantization (physics), Artificial intelligence, Artificial neural network, Quantization (signal processing)
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
3
Name            Order  Citations  PageRank
Yukun Bao       1      0          0.34
Yuhui Xu        2      12         5.00
Hongkai Xiong   3      512        82.84