Title |
---|
An Energy-Efficient Accelerator with Relative-Indexing Memory for Sparse Compressed Convolutional Neural Network |
Abstract |
---|
Deep convolutional neural networks (CNNs) are widely used in image recognition and feature classification. However, deep CNNs are difficult to deploy fully on edge devices because of their computation-intensive and memory-intensive workloads, and their energy efficiency is dominated by off-chip memory accesses and convolution computation. In this paper, an energy-efficient accelerator is proposed for sparse compressed CNNs; it reduces DRAM accesses and eliminates zero-operand computation. Weight compression with relative indexing reduces the required memory capacity and bandwidth, since pruning removes a large portion of connections; in addition, the ReLU function produces zero-valued activations, a further source of sparsity. Workloads are distributed across channels to increase the degree of task parallelism, and all-row-to-all-row non-zero element multiplication is adopted to skip redundant computation. Compared with a dense accelerator, simulation results on VGG-16 show that the proposed accelerator achieves a 1.79x speedup while reducing on-chip memory size by 23.51%, energy by 69.53%, and DRAM accesses by 88.67%. |
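The relative-indexing compression and zero-operand skipping described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's exact storage format: the function names, the 4-bit index width, and the filler-entry handling for large gaps are assumptions for the sketch.

```python
# Sketch of relative-indexing compression: each nonzero weight is stored
# together with its distance (gap) from the previous nonzero, so long runs
# of pruned (zero) weights cost nothing. Index width and filler handling
# are illustrative assumptions.

def compress(weights, index_bits=4):
    """Encode a weight row as (gap, value) pairs.

    If a run of zeros exceeds the index field's range, a zero-valued
    filler entry is emitted so the gap always fits in index_bits bits.
    """
    max_gap = (1 << index_bits) - 1
    out = []
    gap = 0
    for w in weights:
        if w == 0.0:
            gap += 1
            if gap > max_gap:          # gap overflow: emit a filler entry
                out.append((max_gap, 0.0))
                gap = 0
        else:
            out.append((gap, w))
            gap = 0
    return out

def sparse_dot(compressed_w, activations):
    """Accumulate products only for nonzero weight/activation pairs,
    i.e. zero-operand computations are skipped entirely."""
    acc = 0.0
    pos = -1
    for gap, w in compressed_w:
        pos += gap + 1                 # reconstruct absolute position
        if w != 0.0 and activations[pos] != 0.0:
            acc += w * activations[pos]
    return acc
```

For example, `compress([0.0, 0.0, 3.0, 0.0, 4.0])` yields `[(2, 3.0), (1, 4.0)]`: two entries instead of five, and the multiply loop touches only those two positions. Zero-valued ReLU activations are skipped by the same operand check.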
Year | DOI | Venue |
---|---|---|
2019 | 10.1109/AICAS.2019.8771600 | 2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS) |
Keywords | Field | DocType |
---|---|---|
energy-efficient accelerator,sparse compressed CNNs,DRAM accesses,weight compression,on-chip memory size,indexing memory,sparse compressed convolutional neural network,image recognition,feature classification,deep CNNs,memory-intensive workloads,energy efficiency,off-chip memory accesses,convolution computation,dense accelerator,all-row-to-all-row nonzero element multiplication,zero-operand computation,relative-indexing memory,deep convolutional neural network,computation-intensive workloads,ReLU function,zero-valued activations,task parallelism | DRAM,Convolutional neural network,Convolution,Task parallelism,Computer science,Parallel computing,Search engine indexing,Bandwidth (signal processing),Multiplication,Speedup | Conference |
ISBN | Citations | PageRank |
---|---|---|
978-1-5386-7885-5 | 0 | 0.34 |
References | Authors |
---|---|
0 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
I-Chen Wu | 1 | 208 | 55.03 |
Po-tsang Huang | 2 | 130 | 21.23 |
Chin-Yang Lo | 3 | 0 | 0.68 |
Wei Hwang | 4 | 254 | 44.40 |