Title
Quantization of Deep Neural Networks for Accurate Edge Computing
Abstract
Deep neural networks have demonstrated great potential in recent years, exceeding the performance of human experts in a wide range of applications. Due to their large size, however, compression techniques such as weight quantization and pruning are usually applied before they can be accommodated on the edge. It is generally believed that quantization leads to performance degradation, and many existing works have explored quantization strategies that aim at minimum accuracy loss. In this paper, we argue that quantization, which essentially imposes regularization on weight representations, can sometimes help to improve accuracy. We conduct comprehensive experiments on three widely used applications: a fully connected network for biomedical image segmentation, a convolutional neural network for image classification on ImageNet, and a recurrent neural network for automatic speech recognition. Experimental results show that quantization improves accuracy by 1%, 1.95%, and 4.23% on the three applications, respectively, with 3.5x-6.4x memory reduction.
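To make the weight quantization named in the abstract concrete, the following is a minimal sketch of symmetric uniform weight quantization in Python. The function name, NumPy-based interface, and bit widths are illustrative assumptions, not the paper's specific scheme.

# Minimal sketch of symmetric uniform weight quantization (illustrative only).
# The function name, interface, and default bit width are assumptions; this is
# not the authors' exact quantization method.
import numpy as np

def quantize_weights(w, bits=8):
    # Largest magnitude representable on a signed b-bit integer grid, e.g. 127 for 8 bits.
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax          # map the largest weight magnitude to qmax
    if scale == 0:
        return w.copy()                       # all-zero tensor: nothing to quantize
    q = np.clip(np.round(w / scale), -qmax, qmax)  # snap weights to the integer grid
    return q * scale                          # dequantized values the network would use

# Example: quantize a random weight matrix to 4 bits and measure the error.
w = np.random.randn(256, 128).astype(np.float32)
w_q = quantize_weights(w, bits=4)
print(float(np.mean((w - w_q) ** 2)))         # mean squared quantization error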
Year
2021
DOI
10.1145/3451211
Venue
ACM Journal on Emerging Technologies in Computing Systems
Keywords
Edge computing, deep neural networks, quantization
DocType
Journal
Volume
17
Issue
4
ISSN
1550-4832
Citations
0
PageRank
0.34
References
0
Authors
10
Name            Order  Citations  PageRank
Wentao Chen     1      0          0.34
Hailong Qiu     2      3          2.12
Jian Zhuang     3      9          3.34
Chutong Zhang   4      0          0.34
Yu Hu           5      9          1.61
Qing Lu         6      13         2.73
Tianchen Wang   7      20         8.02
Yiyu Shi        8      553        83.22
Meiping Huang   9      2          4.79
Xiaowei Xu      10     0          0.34