Dataflow-based Joint Quantization of Weights and Activations for Deep Neural Networks. - Citegraph

Paper Info

Title
Dataflow-based Joint Quantization of Weights and Activations for Deep Neural Networks.

Abstract
This paper addresses a challenging problem: how to reduce energy consumption without incurring a performance drop when deploying deep neural networks (DNNs) at the inference stage. To alleviate the computation and storage burdens, we propose a novel dataflow-based joint quantization approach, built on the hypothesis that fewer quantization operations incur less information loss and thus improve final performance. The approach first introduces a quantization scheme that uses efficient bit-shifting and rounding operations to represent network parameters and activations in low precision. It then restructures the network architecture into unified modules for optimization on the quantized model. Extensive experiments on ImageNet and KITTI validate the effectiveness of our model, demonstrating that the quantized model achieves state-of-the-art results on various tasks. We also designed and synthesized an RTL model to compare hardware costs across quantization methods. For each quantization operation, our method reduces area cost by about 15 times and energy consumption by about 9 times compared to a strong baseline.
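The abstract describes the quantization scheme only at a high level. As a rough illustration of what quantization "with bit-shifting and rounding" can look like, the following is a minimal sketch that assumes signed fixed-point values with power-of-two scales. The function names (`quantize`, `shift_round`), the 8-bit width, and the round-half-up rule are assumptions made for this example and are not taken from the paper.

```python
import math


def quantize(x: float, frac_bits: int, bits: int = 8) -> int:
    """Map a real value to a signed `bits`-bit integer with scale 2**-frac_bits.

    Assumption: round-half-up followed by saturation to the signed range.
    """
    q = math.floor(x * (1 << frac_bits) + 0.5)
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, q))


def shift_round(acc: int, shift: int) -> int:
    """Divide an integer by 2**shift using only an add and a right shift.

    Adding half of the divisor before shifting gives round-half-up.
    """
    if shift <= 0:
        return acc << -shift
    return (acc + (1 << (shift - 1))) >> shift


# Hypothetical weight and activation, both stored with 4 fractional bits.
w_q = quantize(0.75, frac_bits=4)   # 0.75 * 16
a_q = quantize(1.5, frac_bits=4)    # 1.5 * 16

# The integer product carries 8 fractional bits; shifting right by 4
# brings it back to 4 fractional bits in a single quantization step.
prod_q = shift_round(w_q * a_q, shift=4)
prod_real = prod_q / (1 << 4)
```

In this sketch the only operations on the integer path are a multiply, an add, and a shift, which is the kind of cost profile that power-of-two scales are usually chosen for.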

Year	2019
Venue	arXiv: Learning
DocType	Journal
Volume	abs/1901.02064
ISSN	Data Compression Conference 2019
Citations	1
PageRank	0.37
References	15
Authors	7

Authors (7 rows)

Name	Order	Citations	PageRank
Xue Geng	1	1	0.37
Jie Fu	2	53	7.71
Bin Zhao	3	95	23.81
Jie Lin	4	3495	502.80
Mohamed M. Sabry Aly	5	1	0.37
Chris Pal	6	2140	106.53
Vijay Chandrasekhar	7	191	22.83
