Abstract |
---|
Convolutional Neural Networks (CNNs) consist of computationally complex, slow convolutional layers and memory-demanding fully-connected layers. Current pruning techniques can reduce memory accesses and power consumption, but cannot speed up the convolutional layers. In this paper, we introduce a pruning technique that reduces the number of kernels in convolutional layers by up to 90% with negligible accuracy degradation. We propose an architecture that accelerates both fully-connected and convolutional computations within a single computational core, with power/energy consumption within the budget of mobile devices. The proposed pruning technique speeds up convolutional computations by up to 6.9×, reducing memory accesses by the same factor. |
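The abstract describes removing up to 90% of the kernels (filters) in convolutional layers. The paper's exact pruning criterion is not given here; the sketch below illustrates the general idea with a common magnitude-based proxy (keeping only the filters with the largest L1 norms), where `prune_filters` and `keep_ratio` are hypothetical names introduced for illustration.

```python
import numpy as np

def prune_filters(weights, keep_ratio):
    """Keep the top `keep_ratio` fraction of filters by L1 norm.

    NOTE: magnitude-based selection is an assumption for illustration,
    not necessarily the criterion used in the paper.

    weights: array of shape (num_filters, in_channels, kH, kW)
    Returns the pruned weight tensor and the indices of kept filters.
    """
    num_filters = weights.shape[0]
    num_keep = max(1, int(round(num_filters * keep_ratio)))
    # L1 norm of each filter, flattened over channels and spatial dims
    l1 = np.abs(weights).reshape(num_filters, -1).sum(axis=1)
    # Indices of the num_keep filters with the largest L1 norms
    keep = np.sort(np.argsort(l1)[-num_keep:])
    return weights[keep], keep

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 3, 3, 3))   # 64 filters of size 3x3x3
pruned, kept = prune_filters(w, keep_ratio=0.1)
print(pruned.shape)  # (6, 3, 3, 3): ~90% of kernels removed
```

Dropping a filter removes its entire output channel, so both the multiply-accumulate count and the weight memory traffic of the layer shrink proportionally, which is how kernel-level pruning can translate into the reported speedup and memory-access reduction.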
Year | DOI | Venue |
---|---|---|
2018 | 10.1109/NEWCAS.2018.8585517 | 2018 16th IEEE International New Circuits and Systems Conference (NEWCAS) |
Keywords | Field | DocType
---|---|---|
negligible accuracy degradation, convolutional computations, pruning technique, memory accesses, multimode accelerator, pruned deep neural networks, Convolutional Neural Networks, complex layers, memory-demanding, power consumption, energy consumption, convolutional layers | Kernel (linear algebra), Computer science, Convolutional neural network, Convolution, Parallel computing, Electronic engineering, Memory management, Sparse matrix, Pruning, Speedup, Computation | Conference
ISSN | ISBN | Citations
---|---|---|
2472-467X | 978-1-5386-4860-5 | 0
PageRank | References | Authors
---|---|---|
0.34 | 0 | 3
Name | Order | Citations | PageRank |
---|---|---|---|
Arash Ardakani | 1 | 33 | 8.42 |
Carlo Condo | 2 | 132 | 21.40 |
Warren J. Gross | 3 | 1106 | 113.38 |