Title
Sparse ReRAM Engine: Joint Exploration of Activation and Weight Sparsity in Compressed Neural Networks
Abstract
Exploiting model sparsity to reduce ineffectual computation is a common approach to improving the energy efficiency of DNN inference accelerators. However, due to the tightly coupled crossbar structure, sparsity remains far less explored for ReRAM-based NN accelerators. Existing architectural studies of ReRAM-based NN accelerators assume that an entire crossbar array can be activated in a single cycle. In practice, however, to preserve inference accuracy, matrix-vector computation must be performed at a smaller granularity, called an Operation Unit (OU). An OU-based architecture creates a new opportunity to exploit DNN sparsity. In this paper, we propose the first practical Sparse ReRAM Engine that exploits both weight and activation sparsity. Our evaluation shows that the proposed method is effective in eliminating ineffectual computation, delivering significant performance improvements and energy savings.
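The abstract's central mechanism, performing the crossbar matrix-vector multiplication one small Operation Unit (OU) at a time and skipping OUs that contribute nothing, can be illustrated with a short functional simulation. The sketch below is not the paper's hardware design: the function name ou_sparse_mvm and the OU dimensions (9 wordlines by 8 bitlines) are illustrative assumptions. An OU is skipped when all of its weights are zero (weight sparsity) or when all activations feeding its wordlines are zero (activation sparsity).

import numpy as np

def ou_sparse_mvm(weights, activations, ou_rows=9, ou_cols=8):
    # Functional model of OU-granularity matrix-vector multiplication
    # on a ReRAM crossbar. Each OU covers `ou_rows` wordlines and
    # `ou_cols` bitlines; only OUs with nonzero weights AND nonzero
    # input activations are issued to the crossbar.
    n_rows, n_cols = weights.shape
    out = np.zeros(n_cols)
    issued, total = 0, 0
    for r in range(0, n_rows, ou_rows):
        acts = activations[r:r + ou_rows]
        for c in range(0, n_cols, ou_cols):
            total += 1
            block = weights[r:r + ou_rows, c:c + ou_cols]
            # Ineffectual OU: all-zero weight block (weight sparsity)
            # or all-zero activation slice (activation sparsity).
            if not block.any() or not acts.any():
                continue
            issued += 1
            out[c:c + ou_cols] += acts @ block
    return out, issued, total

# Example: ~90% weight sparsity and ReLU-style activation sparsity.
rng = np.random.default_rng(0)
W = rng.random((128, 128)) * (rng.random((128, 128)) < 0.1)
x = np.maximum(rng.standard_normal(128), 0.0)
y, issued, total = ou_sparse_mvm(W, x)
assert np.allclose(y, x @ W)           # matches the dense product
print(f"issued {issued}/{total} OUs")  # skipped OUs = saved crossbar work

In this model, the fraction of skipped OUs is a rough proxy for the ineffectual computation the paper's engine eliminates; the real design must also handle ADC/DAC costs and weight-to-crossbar mapping, which this sketch omits.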
Year
2019
DOI
10.1145/3307650.3322271
Venue
Proceedings of the 46th International Symposium on Computer Architecture (ISCA)
Keywords
ReRAM, accelerator architecture, neural network, sparsity
Field
Computer science, Inference, Efficient energy use, Parallel computing, Granularity, Artificial neural network, Crossbar switch, Computation, Performance improvement, Resistive random-access memory
DocType
Conference
ISSN
1063-6897
ISBN
978-1-4503-6669-4
Citations
13
PageRank
0.54
References
16
Authors
7
Name               Order   Citations   PageRank
Tzu-Hsien Yang     1       22          4.49
Hsiang-Yun Cheng   2       61          6.07
Chia-Lin Yang      3       1033        76.39
I-Ching Tseng      4       19          1.03
Han-Wen Hu         5       19          1.37
Hung-Sheng Chang   6       36          3.30
Hsiang-Pang Li     7       123         9.54