Mixed Precision Quantization for ReRAM-based DNN Inference Accelerators - Citegraph

Paper Info

Title
Mixed Precision Quantization for ReRAM-based DNN Inference Accelerators

Abstract
ReRAM-based accelerators have shown great potential for accelerating DNN inference because ReRAM crossbars can perform analog matrix-vector multiplication (MVM) operations with low latency and energy consumption. However, these crossbars require ADCs, which constitute a significant fraction of the cost of MVM operations. The overhead of ADCs can be mitigated via partial sum quantization. Prior quantization flows for DNN inference accelerators do not consider partial sum quantization, which is not highly relevant to traditional digital architectures. To address this issue, we propose a mixed precision quantization scheme for ReRAM-based DNN inference accelerators in which weight quantization, input quantization, and partial sum quantization are jointly applied for each DNN layer. We also propose an automated quantization flow powered by deep reinforcement learning to search for the best quantization configuration in the large design space. Our evaluation shows that the proposed mixed precision quantization scheme and quantization flow reduce inference latency and energy consumption by up to 3.89× and 4.84×, respectively, while losing only 1.18% in DNN inference accuracy.
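
To make the three quantization points named in the abstract concrete, the following is a minimal sketch, not the paper's implementation. It assumes uniform symmetric quantization (the abstract does not specify the quantizer) and models a layer's matrix-vector product as a sum of per-crossbar partial sums, each passed through an ADC of limited resolution. The names `quantize`, `crossbar_mvm`, and `xbar_rows` are hypothetical and chosen only for illustration.

```python
def quantize(x, bits, max_abs):
    """Uniform symmetric quantization of x to a signed grid with `bits` bits.

    Assumption: values are clipped to [-max_abs, max_abs]; bits must be >= 2.
    """
    if bits < 2:
        raise ValueError("bits must be >= 2 for a signed symmetric grid")
    if max_abs == 0:
        return 0.0
    levels = 2 ** (bits - 1) - 1          # largest positive integer code
    step = max_abs / levels               # width of one quantization step
    code = max(-levels, min(levels, round(x / step)))
    return code * step


def crossbar_mvm(weights, inputs, w_bits, in_bits, psum_bits, xbar_rows):
    """Matrix-vector product with quantized weights, inputs, and partial sums.

    weights[i][j] connects input i to output j. The input dimension is split
    into groups of `xbar_rows` rows, one group per (hypothetical) crossbar.
    Each crossbar yields one partial sum per output, which is quantized to
    `psum_bits` (standing in for the ADC) before digital accumulation.
    """
    n_in, n_out = len(weights), len(weights[0])
    w_max = max(abs(w) for row in weights for w in row)
    x_max = max(abs(x) for x in inputs)

    wq = [[quantize(w, w_bits, w_max) for w in row] for row in weights]
    xq = [quantize(x, in_bits, x_max) for x in inputs]

    # Worst-case magnitude of a single crossbar's partial sum.
    psum_max = xbar_rows * w_max * x_max

    out = [0.0] * n_out
    for start in range(0, n_in, xbar_rows):
        rows = range(start, min(start + xbar_rows, n_in))
        for j in range(n_out):
            psum = sum(xq[i] * wq[i][j] for i in rows)
            out[j] += quantize(psum, psum_bits, psum_max)
    return out


# Illustrative per-layer configuration: each layer gets its own bit widths.
layer_config = {"w_bits": 8, "in_bits": 8, "psum_bits": 6, "xbar_rows": 2}
weights = [[0.5, -0.25], [0.125, 0.75], [-0.5, 0.25], [0.25, -0.125]]
inputs = [1.0, 0.5, -0.5, 0.25]
print(crossbar_mvm(weights, inputs, **layer_config))
```

In this sketch a mixed precision configuration is simply one such dictionary per layer. Lowering `psum_bits` models a cheaper ADC at the cost of larger error in the accumulated output, which is the trade-off a search over per-layer configurations would explore.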

Year	2021
DOI	10.1145/3394885.3431554
Venue	2021 26th Asia and South Pacific Design Automation Conference (ASP-DAC)
Keywords	Mixed precision quantization, ReRAM, DNN inference accelerators
DocType	Conference
ISSN	2153-6961
ISBN	978-1-7281-8057-1
Citations	2
PageRank	0.37
References	0
Authors	18

Authors (18 rows)

Name	Order	Citations	PageRank
Sitao Huang	1	81	9.68
Aayush Ankit	2	78	11.75
Plinio Silveira	3	2	0.37
Rodrigo Antunes	4	2	0.37
Sai Rahul Chalamalasetti	5	136	16.33
Izzat El Hajj	6	79	6.91
Dong Eun Kim	7	9	1.53
Glaucimar Aguiar	8	2	0.37
Pedro Bruel	9	2	0.37
Sergey Serebryakov	10	2	2.06
Cong Xu	11	1154	48.25
Can Li	12	2	0.71
Paolo Faraboschi	13	974	81.37
John Paul Strachan	14	280	17.84
Deming Chen	15	1432	127.66
Kaushik Roy	16	239	20.51
Wen-mei W. Hwu	17	4322	511.62
Dejan S. Milojicic	18	249	31.80
