Title
Dataflow-Based Joint Quantization for Deep Neural Networks
Abstract
This paper addresses a challenging problem: how to reduce energy consumption without incurring a performance drop when deploying deep neural networks (DNNs) at the inference stage. To alleviate the computation and storage burdens, we propose a novel dataflow-based joint quantization approach, built on the hypothesis that fewer quantization operations incur less information loss and thus improve final performance. The approach first introduces a quantization scheme with efficient bit-shifting and rounding operations to represent network parameters and activations in low precision. It then restructures the network architecture into unified modules for optimization of the quantized model. Extensive experiments on ImageNet and KITTI validate the effectiveness of our model, demonstrating that the quantized model achieves state-of-the-art results on various tasks. In addition, we designed and synthesized an RTL model to measure the hardware costs of various quantization methods. Per quantization operation, it reduces area cost by about 15 times and energy consumption by about 9 times compared to a strong baseline.
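The abstract describes the quantizer only at a high level, so the following is a minimal illustrative sketch, not the paper's exact method: quantization by rounding under a hypothetical power-of-two scale 2**shift, so that scaling reduces to a bit shift on integer hardware, with a signed num_bits range. The names shift_round_quantize and dequantize, and the parameter values, are invented for illustration.

    import numpy as np

    def shift_round_quantize(x, num_bits=8, shift=4):
        # Hypothetical shift-and-round uniform quantizer: scale by a
        # power of two (a left shift on integer hardware), round to the
        # nearest integer, and clip to the signed num_bits range.
        qmin = -(2 ** (num_bits - 1))
        qmax = 2 ** (num_bits - 1) - 1
        return np.clip(np.round(x * (1 << shift)), qmin, qmax).astype(np.int32)

    def dequantize(q, shift=4):
        # Recover an approximation of x; on hardware this is a right shift.
        return q.astype(np.float32) / (1 << shift)

    # Usage: quantize a small weight tensor and check the reconstruction error.
    w = np.random.uniform(-1.0, 1.0, size=(4, 4)).astype(np.float32)
    w_q = shift_round_quantize(w)
    print("max abs error:", np.max(np.abs(w - dequantize(w_q))))

Because the scale is a power of two, the multiply and divide above reduce to bit shifts in fixed-point hardware, which is consistent with the abstract's emphasis on efficient bit-shifting and rounding operations.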
Year
2019
DOI
10.1109/DCC.2019.00086
Venue
2019 Data Compression Conference (DCC)
Keywords
Quantization, Deep Neural Networks, Dataflow, Power, Area
Field
Computer science, Network architecture, Theoretical computer science, Rounding, Dataflow, Artificial intelligence, Quantization (physics), Deep learning, Quantization (signal processing), Computer engineering, Energy consumption, Computation
DocType
Conference
ISSN
1068-0314
ISBN
978-1-7281-0658-8
Citations
0
PageRank
0.34
References
1
Authors
7
Name                  Order  Citations  PageRank
Xue Geng              1      0          0.34
Jie Fu                2      53         7.71
Bin Zhao              3      95         23.81
Jie Lin               4      3495       502.80
Mohamed M. Sabry Aly  5      1          2.71
Chris Pal             6      2140       106.53
Vijay Chandrasekhar   7      191        22.83