Title |
---|
ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization |
Abstract |
---|
Quantization is a technique to reduce the computation and memory cost of DNN models, which are getting increasingly large. Existing quantization solutions use fixed-point integer or floating-point types, which have limited benefits, as both require more bits to maintain the accuracy of the original models. On the other hand, variable-length quantization uses low-bit quantization for normal values and high-precision representation for a fraction of outlier values. Even though this line of work brings algorithmic benefits, it also introduces significant hardware overheads due to variable-length encoding and decoding. In this work, we propose a fixed-length adaptive numerical data type called ANT to achieve low-bit quantization with tiny hardware overheads. Our data type ANT leverages two key innovations to exploit the intra-tensor and inter-tensor adaptive opportunities in DNN models. First, we propose a particular data type, flint, which combines the advantages of float and int to adapt to the importance of different values within a tensor. Second, we propose an adaptive framework that selects the best type for each tensor according to its distribution characteristics. We design a unified processing-element architecture for ANT and show its ease of integration with existing DNN accelerators. Our design achieves a $2.8\times$ speedup and a $2.5\times$ energy-efficiency improvement over state-of-the-art quantization accelerators. |
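The inter-tensor adaptivity described in the abstract (selecting a numerical type per tensor based on its distribution) can be illustrated with a toy sketch. The candidate quantizers below (`quant_int4`, a uniform 4-bit integer grid, and `quant_fp4`, a float-like grid with exponentially spaced levels) and the MSE-based selection rule are illustrative assumptions, not ANT's actual flint encoding or selection algorithm:

```python
import numpy as np

def quant_int4(x):
    # Symmetric 4-bit integer quantization: uniform steps across the range.
    s = np.max(np.abs(x)) / 7.0
    return np.clip(np.round(x / s), -8, 7) * s

def quant_fp4(x):
    # Toy 4-bit float-like quantization: exponentially spaced levels
    # give fine resolution near zero and coarse steps for outliers.
    levels = np.array([0, 1, 2, 4, 8, 16, 32, 64], dtype=float)
    s = np.max(np.abs(x)) / levels[-1]
    grid = np.concatenate([-levels[::-1] * s, levels * s])
    idx = np.argmin(np.abs(x[..., None] - grid), axis=-1)
    return grid[idx]

def pick_type(x):
    # Per-tensor type selection by reconstruction error (MSE) --
    # the metric here is an assumption for illustration.
    cands = {"int4": quant_int4(x), "fp4": quant_fp4(x)}
    return min(cands, key=lambda k: np.mean((cands[k] - x) ** 2))
```

On a roughly uniform tensor the integer grid wins, while a heavy-tailed tensor with a few large outliers favors the float-like grid, mirroring the intuition that one fixed type cannot fit all tensor distributions.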
Year | DOI | Venue |
---|---|---|
2022 | 10.1109/MICRO56248.2022.00095 | 2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO) |
Keywords | DocType | ISBN |
Deep Neural Network, Quantization, Adaptive Numerical Data Type | Conference | 978-1-6654-7428-3 |
Citations | PageRank | References |
0 | 0.34 | 41 |
Authors |
---|
8 |
Name | Order | Citations | PageRank |
---|---|---|---|
Cong Guo | 1 | 3 | 1.05 |
Chen Zhang | 2 | 603 | 26.75 |
Jingwen Leng | 3 | 49 | 12.97 |
Zihan Liu | 4 | 0 | 0.34 |
Fan Yang | 5 | 0 | 0.34 |
Yunxin Liu | 6 | 694 | 54.18 |
Minyi Guo | 7 | 3969 | 332.25 |
Yuhao Zhu | 8 | 242 | 23.06 |