Title
Swallow: A Versatile Accelerator for Sparse Neural Networks
Abstract
Sparse neural networks (SNNs) are emerging as a promising technique for resource-limited intelligent embedded systems because of their compact model size and uncompromised accuracy. Recently, many dedicated neural network accelerators have begun to exploit the sparsity of neural network models for performance boosts and energy savings. However, existing sparsity-aware accelerators either fail to support both sparse weights and sparse activations, or fail to support them simultaneously in both convolutional (Conv) layers and fully connected (FC) layers, which dominate the computation time of neural networks. In this article, we propose a novel sparsity-aware accelerator architecture, called Swallow, to substantially improve inference performance by eliminating ineffectual weights and activations of neural networks. Swallow comprises: 1) a 2-D systolic architecture that fully utilizes the sparsity of both weights and activations in both Conv and FC layers and 2) a sparsity-aware dataflow that is optimized to reuse both weights and activations and to achieve high processing element (PE) utilization via sparse matrix multiplication tiling. Comprehensive evaluations based on a place-and-route process show that Swallow, with a peak performance of 614 GOP/s and a power of 1.26 W, outperforms the state-of-the-art sparsity-aware accelerator Cambricon-X by 1.32x in terms of energy efficiency.
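To make the abstract's notion of "sparse matrix multiplication tiling" concrete, the following Python sketch models the idea in software: a matrix product is processed tile by tile, and within each tile only effectual (nonzero) weight/activation pairs contribute work. This is a minimal, hypothetical illustration only; the tile size T, the function name sparse_tiled_matmul, and the NumPy baseline are assumptions for exposition, not Swallow's actual dataflow or PE array.

import numpy as np

T = 4  # illustrative tile size, standing in for the PE-array dimensions

def sparse_tiled_matmul(W: np.ndarray, A: np.ndarray) -> np.ndarray:
    """Compute W @ A tile by tile, multiplying only nonzero operand pairs."""
    M, K = W.shape
    K2, N = A.shape
    assert K == K2
    out = np.zeros((M, N))
    for i0 in range(0, M, T):
        for j0 in range(0, N, T):
            for k0 in range(0, K, T):
                # Within a tile, enumerate only effectual (nonzero) pairs,
                # mimicking how a sparsity-aware PE skips zero operands.
                for i in range(i0, min(i0 + T, M)):
                    for k in range(k0, min(k0 + T, K)):
                        w = W[i, k]
                        if w == 0.0:
                            continue  # skip ineffectual weight
                        for j in range(j0, min(j0 + T, N)):
                            a = A[k, j]
                            if a == 0.0:
                                continue  # skip ineffectual activation
                            out[i, j] += w * a
    return out

# Usage: random matrices pruned to ~70% zeros; the result matches the dense product.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8)) * (rng.random((8, 8)) > 0.7)
A = rng.standard_normal((8, 8)) * (rng.random((8, 8)) > 0.7)
assert np.allclose(sparse_tiled_matmul(W, A), W @ A)

An FC layer maps directly onto such a matrix product, and a Conv layer can be lowered to one (e.g., via im2col), which is why a single tiled sparse-matmul dataflow can serve both layer types.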
Year
2020
DOI
10.1109/TCAD.2020.2978836
Venue
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Keywords
Accelerator, convolutional (Conv) layers, fully connected (FC) layers, sparse neural networks (SNNs)
DocType
Journal
Volume
39
Issue
12
ISSN
0278-0070
Citations
2
PageRank
0.40
References
0
Authors
4
Name           Order  Citations  PageRank
Bosheng Liu    1      32         3.44
Xiaoming Chen  2      43         13.67
Yinhe Han      3      666        67.18
Haobo Xu       4      13         3.04