Title |
---|
Sparse ReRAM Engine: Joint Exploration of Activation and Weight Sparsity in Compressed Neural Networks |
Abstract |
---|
Exploiting model sparsity to reduce ineffectual computation is a common approach to achieving energy-efficient DNN inference accelerators. However, due to the tightly coupled crossbar structure, exploiting sparsity in ReRAM-based NN accelerators is a less explored area. Existing architectural studies on ReRAM-based NN accelerators assume that an entire crossbar array can be activated in a single cycle. In practice, however, matrix-vector computation must be performed at a smaller granularity, called an Operation Unit (OU), to preserve inference accuracy. An OU-based architecture creates a new opportunity to exploit DNN sparsity. In this paper, we propose the first practical Sparse ReRAM Engine that exploits both weight and activation sparsity. Our evaluation shows that the proposed method is effective in eliminating ineffectual computation, and delivers significant performance improvements and energy savings. |
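The abstract's key idea is that OU-granularity activation exposes skippable work that whole-crossbar activation cannot. Below is a minimal NumPy sketch of that idea only; the `ou_sparse_mvm` name, the 9x8 OU shape, and the OU-aligned structured sparsity in the test data are all illustrative assumptions, not the authors' design. An OU tile is skipped when all of its weights, or all of the activations feeding it, are zero.

```python
import numpy as np

def ou_sparse_mvm(W, x, ou_rows=9, ou_cols=8):
    """Toy model of OU-granularity matrix-vector multiplication.

    W: (R, C) weight matrix mapped onto a ReRAM crossbar.
    x: (R,)  input activations driven on the wordlines.
    Each ou_rows x ou_cols tile of W models one Operation Unit (OU).
    An OU is skipped when all of its weights, or all of the
    activations feeding it, are zero, since it cannot change
    the result. Returns the output and the fraction of OUs skipped.
    """
    R, C = W.shape
    y = np.zeros(C)
    total = skipped = 0
    for r0 in range(0, R, ou_rows):
        xr = x[r0:r0 + ou_rows]
        for c0 in range(0, C, ou_cols):
            total += 1
            tile = W[r0:r0 + ou_rows, c0:c0 + ou_cols]
            if not tile.any() or not xr.any():  # ineffectual OU: skip it
                skipped += 1
                continue
            y[c0:c0 + ou_cols] += xr @ tile
    return y, skipped / total

# OU-aligned structured sparsity (an illustrative assumption): zero out
# whole 9x8 tiles so that some OUs become entirely ineffectual.
rng = np.random.default_rng(0)
W = rng.standard_normal((36, 32)) * np.kron(rng.random((4, 4)) < 0.5,
                                            np.ones((9, 8)))
x = np.maximum(rng.standard_normal(36), 0.0)  # ReLU output: sparse activations
y, skip_ratio = ou_sparse_mvm(W, x)
assert np.allclose(y, x @ W)  # identical result, fewer OU activations
print(f"skipped {skip_ratio:.0%} of OU operations")
```

In hardware, the skip decision would have to be made by peripheral indexing logic before the wordlines are driven; the Python branch above merely stands in for that logic to show why the result is unchanged while OU activations are saved.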
Year | DOI | Venue |
---|---|---|
2019 | 10.1145/3307650.3322271 | Proceedings of the 46th International Symposium on Computer Architecture |
Keywords | Field | DocType
---|---|---|
ReRAM, accelerator architecture, neural network, sparsity | Computer science, Inference, Efficient energy use, Parallel computing, Granularity, Artificial neural network, Crossbar switch, Computation, Performance improvement, Resistive random-access memory | Conference
ISSN | ISBN | Citations
---|---|---|
1063-6897 | 978-1-4503-6669-4 | 13
PageRank | References | Authors
---|---|---|
0.54 | 16 | 7
Name | Order | Citations | PageRank |
---|---|---|---|
Tzu-Hsien Yang | 1 | 22 | 4.49 |
Hsiang-Yun Cheng | 2 | 61 | 6.07 |
Chia-Lin Yang | 3 | 1033 | 76.39 |
I-Ching Tseng | 4 | 19 | 1.03 |
Han-Wen Hu | 5 | 19 | 1.37 |
Hung-Sheng Chang | 6 | 36 | 3.30 |
Hsiang-Pang Li | 7 | 123 | 9.54 |