Title
Quantization of Deep Neural Networks for Accurate Edge Computing
Abstract
Deep neural networks have demonstrated great potential in recent years, exceeding the performance of human experts in a wide range of applications. Due to their large size, however, compression techniques such as weight quantization and pruning are usually applied before they can be accommodated on the edge. It is generally believed that quantization leads to performance degradation, and many existing works have explored quantization strategies that aim at minimum accuracy loss. In this paper, we argue that quantization, which essentially imposes regularization on weight representations, can sometimes help to improve accuracy. We conduct comprehensive experiments on three widely used applications: a fully connected network for biomedical image segmentation, a convolutional neural network for image classification on ImageNet, and a recurrent neural network for automatic speech recognition. Experimental results show that quantization improves accuracy by 1%, 1.95%, and 4.23% on the three applications, respectively, with 3.5x-6.4x memory reduction.
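To make the weight quantization named in the abstract concrete, the following is a minimal sketch of symmetric uniform weight quantization in Python. The function name, NumPy-based interface, and bit widths are illustrative assumptions, not the paper's specific scheme.

# Minimal sketch of symmetric uniform weight quantization (illustrative only).
# The function name, interface, and default bit width are assumptions; this is
# not the authors' exact quantization method.
import numpy as np

def quantize_weights(w, bits=8):
    # Largest magnitude representable on a signed b-bit integer grid, e.g. 127 for 8 bits.
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax          # map the largest weight magnitude to qmax
    if scale == 0:
        return w.copy()                       # all-zero tensor: nothing to quantize
    q = np.clip(np.round(w / scale), -qmax, qmax)  # snap weights to the integer grid
    return q * scale                          # dequantized values the network would use

# Example: quantize a random weight matrix to 4 bits and measure the error.
w = np.random.randn(256, 128).astype(np.float32)
w_q = quantize_weights(w, bits=4)
print(float(np.mean((w - w_q) ** 2)))         # mean squared quantization error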
Year
2021
DOI
10.1145/3451211
Venue
ACM Journal on Emerging Technologies in Computing Systems
Keywords
Edge computing, deep neural networks, quantization
DocType
Journal
Volume
17
Issue
4
ISSN
1550-4832
Citations
0
PageRank
0.34
References
0
Authors
10
Name            Order  Citations  PageRank
Wentao Chen     1      0          0.34
Hailong Qiu     2      3          2.12
Jian Zhuang     3      9          3.34
Chutong Zhang   4      0          0.34
Yu Hu           5      9          1.61
Qing Lu         6      13         2.73
Tianchen Wang   7      20         8.02
Yiyu Shi        8      553        83.22
Meiping Huang   9      2          4.79
Xiaowei Xu      10     0          0.34