Title
Sentei: Filter-Wise Pruning With Distillation Towards Efficient Sparse Convolutional Neural Network Accelerators
Abstract
In the realization of convolutional neural networks (CNNs) on resource-constrained embedded hardware, the memory footprint of the weights is one of the primary problems. Pruning techniques are often used to reduce the number of weights. However, pruning leaves the distribution of nonzero weights across filters highly skewed, which makes it difficult to exploit the underlying parallelism. To address this problem, we present Sentei, filter-wise pruning with distillation, to realize a hardware-aware network architecture with comparable accuracy. The filter-wise pruning eliminates weights such that each filter has the same number of nonzero weights, and retraining with distillation retains the accuracy. Further, we develop a zero-weight-skipping, inter-layer pipelined accelerator on an FPGA. The equalization enables inter-filter parallelism, where a processing block for a layer executes filters concurrently with a straightforward architecture. Our evaluation on a semantic-segmentation task indicates that the resulting mIoU decreases by only 0.4 points. Additionally, our FPGA implementation achieved a 33.2x speedup and 87.9x higher power efficiency compared with a mobile GPU. Therefore, our technique realizes a hardware-aware network with comparable accuracy.
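To make the equalization described in the abstract concrete, the following is a minimal sketch of filter-wise magnitude pruning, assuming PyTorch-style 4-D convolution weights of shape (out_channels, in_channels, kH, kW). The function name filterwise_prune and the keep_ratio parameter are illustrative assumptions, not the paper's actual procedure or API.

    import torch

    def filterwise_prune(weight: torch.Tensor, keep_ratio: float) -> torch.Tensor:
        """Zero the smallest-magnitude weights so that every filter
        retains exactly the same number of nonzero weights."""
        out_ch = weight.shape[0]
        flat = weight.reshape(out_ch, -1)                 # one row per filter
        n_keep = max(1, int(flat.shape[1] * keep_ratio))  # identical count per filter
        topk = flat.abs().topk(n_keep, dim=1).indices     # largest-magnitude indices
        mask = torch.zeros_like(flat)
        mask.scatter_(1, topk, 1.0)                       # 1 where a weight is kept
        return (flat * mask).reshape(weight.shape)

    # Example: a 64-filter 3x3 conv layer pruned so each filter keeps 25%
    # of its 32*3*3 = 288 weights, i.e. exactly 72 nonzeros per filter.
    w = torch.randn(64, 32, 3, 3)
    w_pruned = filterwise_prune(w, keep_ratio=0.25)
    print((w_pruned.reshape(64, -1) != 0).sum(dim=1))     # 72 for every filter

Because every filter ends up with the same nonzero count, per-filter processing blocks in the accelerator see identical workloads and can run in lockstep, which is the inter-filter parallelism the abstract refers to; the retraining-with-distillation step that recovers accuracy is omitted from this sketch.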
Year
2020
DOI
10.1587/transinf.2020PAP0013
Venue
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS
Keywords
sparse convolutional neural network, filter-wise pruning, distillation, FPGA
DocType
Journal
Volume
E103D
Issue
12
ISSN
1745-1361
Citations
0
PageRank
0.34
References
0
Authors
5
Name                Order  Citations  PageRank
Masayuki Shimoda    1      8          6.45
Youki Sada          2      1          1.71
Ryosuke Kuramochi   3      0          2.70
Shimpei Sato        4      12         2.94
Hiroki Nakahara     5      155        37.34