Title |
---|
Post Training Weight Compression with Distribution-based Filter-wise Quantization Step |
Abstract |
---|
Quantizing models to lower bit precision is a promising way to build lower-power, smaller-area neural network hardware. However, quantization to 4 bits or fewer usually requires additional retraining with a labeled dataset for backpropagation to recover test accuracy. In this paper, we propose a quantization scheme with a distribution-based filter-wise quantization step that requires no labeled dataset. A ResNet-50 model with 8-bit activation and 3.04-bit weight precision, quantized with the proposed techniques, achieves a top-1 inference accuracy of 74.3% on ImageNet. |
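The abstract only names the technique, so the following is a minimal sketch of what a distribution-based, filter-wise quantization step could look like, not the paper's actual algorithm: each convolution filter gets its own step size derived from the statistics of its weight distribution (here, assumed proportional to the filter's standard deviation; `sigma_scale` and `quantize_filterwise` are hypothetical names introduced for illustration).

```python
import numpy as np

def quantize_filterwise(weights, n_bits=3, sigma_scale=2.0):
    """Sketch of filter-wise post-training weight quantization.

    Each output filter gets its own quantization step derived from the
    distribution of its weights (assumed here: a multiple of its standard
    deviation). The paper's exact step rule is not reproduced; sigma_scale
    is a hypothetical tuning knob.

    weights: (out_channels, in_channels, kH, kW) conv weight tensor.
    """
    q_max = 2 ** (n_bits - 1) - 1  # symmetric signed range, e.g. [-4, 3] for 3 bits
    out = np.empty_like(weights)
    for f in range(weights.shape[0]):
        w = weights[f]
        # distribution-based step: per-filter range set from the weight std
        step = sigma_scale * w.std() / q_max
        q = np.clip(np.round(w / step), -q_max - 1, q_max)
        out[f] = q * step  # dequantized weights, used to evaluate accuracy
    return out

# Example: quantize a random "conv layer" to 3-bit weights
w = np.random.randn(64, 3, 7, 7).astype(np.float32)
w_q = quantize_filterwise(w, n_bits=3)
```

Because the step is chosen per filter rather than per layer, filters with narrow weight distributions keep a fine step while outlier-heavy filters get a wider one, which is the general motivation for filter-wise schemes.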
Year | DOI | Venue |
---|---|---|
2019 | 10.1109/CoolChips.2019.8721356 | 2019 IEEE Symposium in Low-Power and High-Speed Chips (COOL CHIPS) |
Keywords | Field | DocType
---|---|---|
Quantization (signal), Tuning, Training, Hardware, Backpropagation, Energy efficiency, Filtering algorithms | Compression (physics), Algorithm, Quantization (signal processing), Mathematics | Conference
ISSN | ISBN | Citations
---|---|---|
2473-4683 | 978-1-7281-1749-2 | 1
PageRank | References | Authors
---|---|---|
0.35 | 0 | 4
Name | Order | Citations | PageRank |
---|---|---|---|
Shinichi Sasaki | 1 | 2 | 0.82 |
Asuka Maki | 2 | 2 | 1.86 |
Daisuke Miyashita | 3 | 72 | 9.99 |
Jun Deguchi | 4 | 1 | 2.38 |