Abstract |
---|
Recent years have witnessed wide application of deep convolutional neural networks (DCNNs) in diverse scenarios. However, their large computational cost and memory consumption are barriers for computation-constrained applications. Model quantization is a common method to reduce the storage and computation burden by decreasing the bit width. In this work, we propose a novel cursor-based adaptive quantization method using differentiable architecture search (DAS). The mixed-bit quantization mechanism is formulated as a DAS process with a continuous cursor that represents the quantization bit width, and the cursor-based DAS adaptively searches for the desired bit width of each layer. The DAS process is solved via an alternating approximate optimization procedure. We further devise a new loss function in the search process to jointly optimize the accuracy and the parameter size of the model. In the quantization step, based on a new strategy, the two integers closest to the cursor are adopted together as the bit widths to quantize the DCNN, which reduces quantization noise and avoids the local convergence problem. Comprehensive experiments on benchmark datasets show that our cursor-based adaptive quantization approach efficiently obtains smaller models with comparable or even better classification accuracy. |
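The abstract's core idea, a continuous cursor whose two nearest integers both serve as quantization bit widths, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the uniform symmetric quantizer and the blending of the two bit widths by the cursor's fractional part are assumptions, since the abstract does not specify the exact combination strategy, and the names `quantize` and `cursor_quantize` are hypothetical.

```python
import numpy as np

def quantize(w, bits):
    """Uniform symmetric quantization of a weight tensor to a given bit width
    (an assumed quantizer; the paper's scheme may differ)."""
    levels = 2 ** (bits - 1) - 1            # number of positive levels in a signed range
    scale = np.max(np.abs(w)) / levels      # map the largest magnitude to the top level
    return np.round(w / scale) * scale      # round to the nearest representable value

def cursor_quantize(w, cursor):
    """Quantize w with the two integer bit widths closest to the continuous
    cursor, blended by the cursor's fractional part (illustrative strategy)."""
    lo, hi = int(np.floor(cursor)), int(np.ceil(cursor))
    if lo == hi:                            # cursor already sits on an integer bit width
        return quantize(w, lo)
    frac = cursor - lo                      # weight toward the higher bit width
    return (1 - frac) * quantize(w, lo) + frac * quantize(w, hi)

w = np.random.randn(4, 4).astype(np.float32)
wq = cursor_quantize(w, cursor=3.3)         # mostly 3-bit, partly 4-bit quantization
```

Because both quantizers are differentiable almost everywhere in `w` and the blend weight is differentiable in the cursor, a continuous bit-width parameter of this kind can in principle be optimized by gradient descent, which is what makes the search differentiable.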
Year | DOI | Venue |
---|---|---|
2021 | 10.1109/IJCNN52387.2021.9533578 | 2021 International Joint Conference on Neural Networks (IJCNN) |
Keywords | DocType | ISSN |
---|---|---|
Model Compression, Quantization, Deep Neural Network | Conference | 2161-4393 |
Citations | PageRank | References |
---|---|---|
0 | 0.34 | 0 |
Authors |
---|
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Baopu Li | 1 | 348 | 30.88 |
Yanwen Fan | 2 | 1 | 1.41 |
Zhihong Pan | 3 | 3 | 2.80 |
Zhiyu Cheng | 4 | 0 | 0.34 |
Gang Zhang | 5 | 2 | 3.58 |