Title
Sparse ReRAM Engine: Joint Exploration of Activation and Weight Sparsity in Compressed Neural Networks
Abstract
Exploiting model sparsity to reduce ineffectual computation is a common approach to improving the energy efficiency of DNN inference accelerators. However, due to the tightly coupled crossbar structure, sparsity remains far less explored for ReRAM-based NN accelerators. Existing architectural studies of ReRAM-based NN accelerators assume that an entire crossbar array can be activated in a single cycle. In practice, however, to preserve inference accuracy, matrix-vector computation must be performed at a smaller granularity, called an Operation Unit (OU). An OU-based architecture creates a new opportunity to exploit DNN sparsity. In this paper, we propose the first practical Sparse ReRAM Engine that exploits both weight and activation sparsity. Our evaluation shows that the proposed method is effective in eliminating ineffectual computation, delivering significant performance improvements and energy savings.
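The abstract's central mechanism, performing the crossbar matrix-vector multiplication one small Operation Unit (OU) at a time and skipping OUs that contribute nothing, can be illustrated with a short functional simulation. The sketch below is not the paper's hardware design: the function name ou_sparse_mvm and the OU dimensions (9 wordlines by 8 bitlines) are illustrative assumptions. An OU is skipped when all of its weights are zero (weight sparsity) or when all activations feeding its wordlines are zero (activation sparsity).

import numpy as np

def ou_sparse_mvm(weights, activations, ou_rows=9, ou_cols=8):
    # Functional model of OU-granularity matrix-vector multiplication
    # on a ReRAM crossbar. Each OU covers `ou_rows` wordlines and
    # `ou_cols` bitlines; only OUs with nonzero weights AND nonzero
    # input activations are issued to the crossbar.
    n_rows, n_cols = weights.shape
    out = np.zeros(n_cols)
    issued, total = 0, 0
    for r in range(0, n_rows, ou_rows):
        acts = activations[r:r + ou_rows]
        for c in range(0, n_cols, ou_cols):
            total += 1
            block = weights[r:r + ou_rows, c:c + ou_cols]
            # Ineffectual OU: all-zero weight block (weight sparsity)
            # or all-zero activation slice (activation sparsity).
            if not block.any() or not acts.any():
                continue
            issued += 1
            out[c:c + ou_cols] += acts @ block
    return out, issued, total

# Example: ~90% weight sparsity and ReLU-style activation sparsity.
rng = np.random.default_rng(0)
W = rng.random((128, 128)) * (rng.random((128, 128)) < 0.1)
x = np.maximum(rng.standard_normal(128), 0.0)
y, issued, total = ou_sparse_mvm(W, x)
assert np.allclose(y, x @ W)           # matches the dense product
print(f"issued {issued}/{total} OUs")  # skipped OUs = saved crossbar work

In this model, the fraction of skipped OUs is a rough proxy for the ineffectual computation the paper's engine eliminates; the real design must also handle ADC/DAC costs and weight-to-crossbar mapping, which this sketch omits.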
Year
2019
DOI
10.1145/3307650.3322271
Venue
Proceedings of the 46th International Symposium on Computer Architecture (ISCA)
Keywords
ReRAM, accelerator architecture, neural network, sparsity
Field
Computer science, Inference, Efficient energy use, Parallel computing, Granularity, Artificial neural network, Crossbar switch, Computation, Performance improvement, Resistive random-access memory
DocType
Conference
ISSN
1063-6897
ISBN
978-1-4503-6669-4
Citations
13
PageRank
0.54
References
16
Authors
7
Name               Order   Citations   PageRank
Tzu-Hsien Yang     1       22          4.49
Hsiang-Yun Cheng   2       61          6.07
Chia-Lin Yang      3       1033        76.39
I-Ching Tseng      4       19          1.03
Han-Wen Hu         5       19          1.37
Hung-Sheng Chang   6       36          3.30
Hsiang-Pang Li     7       123         9.54