Title
Efficient Hardware Accelerator For Compressed Sparse Deep Neural Network
Abstract
This work presents a DNN accelerator architecture specifically designed for efficient inference on compressed, sparse DNN models. Leveraging data sparsity, a runtime processing scheme is proposed that operates on the encoded weights and activations directly in the compressed domain, without decompression. Furthermore, a new data flow is proposed to facilitate the reuse of input activations across the fully-connected (FC) layers. The proposed design is implemented and verified on a Xilinx Virtex-7 FPGA. Experimental results show that, running AlexNet, it is 1.99x and 1.95x faster and 20.38x and 3.04x more energy efficient than CPU and mGPU platforms, respectively.
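As an illustration of the idea behind processing in the compressed domain, the sketch below run-length encodes a sparse weight vector and computes a dot product directly on the encoded stream, never materializing the zeros. This is a minimal, hypothetical example of the general technique named in the keywords (run-length compression of sparse data), not the paper's actual hardware scheme.

```python
# Illustrative sketch (not the paper's architecture): nonzero values are
# stored with the count of zeros preceding them, and arithmetic is done
# directly on the encoded stream, skipping zeros without decompressing.

def rle_encode(vec):
    """Encode a sparse vector as (zero_run, value) pairs."""
    encoded, run = [], 0
    for x in vec:
        if x == 0:
            run += 1
        else:
            encoded.append((run, x))
            run = 0
    return encoded

def sparse_dot(encoded, dense):
    """Dot product of an RLE-encoded sparse vector with a dense vector,
    computed without reconstructing the decompressed sparse vector."""
    total, idx = 0, 0
    for zero_run, value in encoded:
        idx += zero_run            # implicitly skip the run of zeros
        total += value * dense[idx]
        idx += 1
    return total

weights = [0, 0, 3, 0, 5, 0, 0, 2]
acts    = [1, 2, 3, 4, 5, 6, 7, 8]
enc = rle_encode(weights)          # [(2, 3), (1, 5), (2, 2)]
print(sparse_dot(enc, acts))       # 3*3 + 5*5 + 2*8 = 50
```

Only the nonzero multiplications are performed, which is the source of the compute and energy savings such accelerators exploit.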
Year
2021
DOI
10.1587/transinf.2020EDL8153
Venue
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS
Keywords
deep neural networks, field programmable gate array, run-length compression, sparse data
DocType
Journal
Volume
E104D
Issue
5
ISSN
1745-1361
Citations
0
PageRank
0.34
References
0
Authors
3
Name          Order  Citations  PageRank
Hao Xiao      1      14         2.43
Kaikai Zhao   2      0          0.34
Guangzhu Liu  3      0          0.34