Title
Post Training Weight Compression with Distribution-based Filter-wise Quantization Step
Abstract
Quantization of models to lower bit precision is a promising method for developing lower-power and smaller-area neural network hardware. However, quantization to 4 bits or fewer usually requires additional retraining with a labeled dataset for backpropagation to recover test accuracy. In this paper, we propose a quantization scheme with a distribution-based, filter-wise quantization step that requires no labeled dataset. A ResNet-50 model with 8-bit activations and 3.04-bit weight precision quantized with the proposed technique achieves a top-1 inference accuracy of 74.3% on ImageNet.
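The abstract does not spell out how the filter-wise quantization step is derived from the weight distribution. As a rough, hedged illustration only, the minimal NumPy sketch below shows what per-filter, distribution-based step selection for low-bit weight quantization could look like; the function name quantize_filterwise and the sigma_scale heuristic are assumptions for this sketch and are not the paper's actual algorithm.

# Hypothetical sketch of distribution-based, filter-wise weight quantization
# (not the authors' exact method): each output filter gets its own step,
# derived from the statistics of that filter's weights, so no labeled data
# or retraining is needed.
import numpy as np

def quantize_filterwise(weights, n_bits=3, sigma_scale=3.0):
    """Quantize conv weights of shape (out_ch, in_ch, kh, kw) per output filter.

    sigma_scale is an assumed hyperparameter: the clipping range of each
    filter is set to sigma_scale * std of its weights, a common
    distribution-based heuristic.
    """
    n_levels = 2 ** (n_bits - 1) - 1              # symmetric signed levels
    q = np.empty_like(weights)
    steps = np.empty(weights.shape[0])
    for f in range(weights.shape[0]):
        w = weights[f]
        step = sigma_scale * w.std() / n_levels   # filter-wise step from the distribution
        step = max(step, 1e-12)                   # guard against all-zero filters
        q[f] = np.clip(np.round(w / step), -n_levels, n_levels) * step
        steps[f] = step
    return q, steps

# Example: quantize a random 64x3x7x7 conv layer to 3-bit weights.
w = np.random.randn(64, 3, 7, 7).astype(np.float32) * 0.05
w_q, steps = quantize_filterwise(w, n_bits=3)
print("mean squared quantization error:", float(((w - w_q) ** 2).mean()))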
Year
2019
DOI
10.1109/CoolChips.2019.8721356
Venue
2019 IEEE Symposium in Low-Power and High-Speed Chips (COOL CHIPS)
Keywords
Quantization (signal), Tuning, Training, Hardware, Backpropagation, Energy efficiency, Filtering algorithms
Field
Compression (physics), Algorithm, Quantization (signal processing), Mathematics
DocType
Conference
ISSN
2473-4683
ISBN
978-1-7281-1749-2
Citations
1
PageRank
0.35
References
0
Authors
4
Name                Order  Citations  PageRank
Shinichi Sasaki     1      2          0.82
Asuka Maki          2      2          1.86
Daisuke Miyashita   3      72         9.99
Jun Deguchi         4      1          2.38