Title
Sentei: Filter-Wise Pruning With Distillation Towards Efficient Sparse Convolutional Neural Network Accelerators
Abstract
In the realization of convolutional neural networks (CNNs) on resource-constrained embedded hardware, the memory footprint of the weights is one of the primary problems. Pruning techniques are often used to reduce the number of weights. However, pruning leaves the distribution of nonzero weights across filters highly skewed, which makes it difficult to exploit the underlying parallelism. To address this problem, we present Sentei, filter-wise pruning with distillation, to realize a hardware-aware network architecture with comparable accuracy. The filter-wise pruning eliminates weights such that each filter has the same number of nonzero weights, and retraining with distillation retains the accuracy. Further, we develop a zero-weight-skipping, inter-layer pipelined accelerator on an FPGA. The equalization enables inter-filter parallelism, where a processing block for a layer executes filters concurrently with a straightforward architecture. Our evaluation on a semantic-segmentation task indicates that the resulting mIoU decreases by only 0.4 points. Additionally, our FPGA implementation achieved a 33.2x speedup and 87.9x higher power efficiency compared with a mobile GPU. Therefore, our technique realizes a hardware-aware network with comparable accuracy.
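To make the equalization described in the abstract concrete, the following is a minimal sketch of filter-wise magnitude pruning, assuming PyTorch-style 4-D convolution weights of shape (out_channels, in_channels, kH, kW). The function name filterwise_prune and the keep_ratio parameter are illustrative assumptions, not the paper's actual procedure or API.

    import torch

    def filterwise_prune(weight: torch.Tensor, keep_ratio: float) -> torch.Tensor:
        """Zero the smallest-magnitude weights so that every filter
        retains exactly the same number of nonzero weights."""
        out_ch = weight.shape[0]
        flat = weight.reshape(out_ch, -1)                 # one row per filter
        n_keep = max(1, int(flat.shape[1] * keep_ratio))  # identical count per filter
        topk = flat.abs().topk(n_keep, dim=1).indices     # largest-magnitude indices
        mask = torch.zeros_like(flat)
        mask.scatter_(1, topk, 1.0)                       # 1 where a weight is kept
        return (flat * mask).reshape(weight.shape)

    # Example: a 64-filter 3x3 conv layer pruned so each filter keeps 25%
    # of its 32*3*3 = 288 weights, i.e. exactly 72 nonzeros per filter.
    w = torch.randn(64, 32, 3, 3)
    w_pruned = filterwise_prune(w, keep_ratio=0.25)
    print((w_pruned.reshape(64, -1) != 0).sum(dim=1))     # 72 for every filter

Because every filter ends up with the same nonzero count, per-filter processing blocks in the accelerator see identical workloads and can run in lockstep, which is the inter-filter parallelism the abstract refers to; the retraining-with-distillation step that recovers accuracy is omitted from this sketch.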
Year
2020
DOI
10.1587/transinf.2020PAP0013
Venue
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS
Keywords
sparse convolutional neural network, filter-wise pruning, distillation, FPGA
DocType
Journal
Volume
E103D
Issue
12
ISSN
1745-1361
Citations
0
PageRank
0.34
References
0
Authors
5
Name                Order  Citations  PageRank
Masayuki Shimoda    1      8          6.45
Youki Sada          2      1          1.71
Ryosuke Kuramochi   3      0          2.70
Shimpei Sato        4      12         2.94
Hiroki Nakahara     5      155        37.34