Title
CompAct: On-chip Compression of Activations for Low Power Systolic Array Based CNN Acceleration
Abstract
This paper addresses the design of systolic array (SA) based convolutional neural network (CNN) accelerators for mobile and embedded domains. On- and off-chip memory accesses to the large activation inputs (sometimes called feature maps) of CNN layers contribute significantly to the total energy consumption of such accelerators. While prior work has proposed off-chip compression, activations are still stored on-chip in uncompressed form, requiring either large on-chip activation buffers or slow and energy-hungry off-chip accesses. In this paper, we propose CompAct, a new architecture that enables on-chip compression of activations for SA-based CNN accelerators. CompAct is built around several key ideas. First, CompAct identifies an SA schedule that has nearly regular access patterns, enabling the use of a modified run-length coding (RLC) scheme. Second, CompAct improves the compression ratio of the RLC scheme by using Sparse-RLC in later CNN layers and Lossy-RLC in earlier layers. Finally, CompAct proposes look-ahead snoozing, which operates synergistically with RLC to reduce the leakage energy of activation buffers. Based on detailed synthesis results, we show that CompAct enables up to a 62% reduction in activation buffer energy and a 34% reduction in total chip energy.
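For readers unfamiliar with run-length coding, the following is a minimal sketch of plain zero run-length coding applied to a sparse activation vector. It is not CompAct's modified RLC, Sparse-RLC, or Lossy-RLC scheme, whose details the abstract does not give. The function names `rlc_encode` and `rlc_decode` and the `(zero_run, value)` pair layout are assumptions made purely for illustration.

```python
def rlc_encode(values):
    """Encode a sequence as (zero_run, nonzero_value) pairs.

    Illustrative only: generic zero run-length coding, not the
    modified scheme proposed in the paper. Trailing zeros are
    stored as (run, None).
    """
    pairs = []
    run = 0
    for v in values:
        if v == 0:
            run += 1              # extend the current run of zeros
        else:
            pairs.append((run, v))
            run = 0
    if run:
        pairs.append((run, None)) # zeros at the end have no value
    return pairs


def rlc_decode(pairs):
    """Invert rlc_encode and recover the original sequence."""
    out = []
    for run, v in pairs:
        out.extend([0] * run)
        if v is not None:
            out.append(v)
    return out


# Hypothetical post-ReLU activations with many zeros.
activations = [0, 0, 5, 0, 0, 0, 3, 7, 0, 0]
encoded = rlc_encode(activations)
assert rlc_decode(encoded) == activations
```

The sparser the activations, the fewer pairs are stored, which is the general reason run-length coding suits activation maps with many zeros.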
Year
2019
DOI
10.1145/3358178
Venue
ACM Transactions on Embedded Computing Systems (TECS)
Keywords
Deep neural networks, low-power design, systolic arrays
DocType
Journal
Volume
18
Issue
5s
ISSN
1539-9087
Citations
0
PageRank
0.34
References
0
Authors
5
Name              Order  Citations  PageRank
Jeff Jun Zhang    1      12         1.90
Parul Raj         2      0          0.34
Shuayb Zarar      3      0          2.70
Amol Ambardekar   4      0          0.34
Siddharth Garg    5      675        55.14