Title
Enhancing Utilization of SIMD-Like Accelerator for Sparse Convolutional Neural Networks
Abstract
Although existing single-instruction-multiple-data (SIMD)-like accelerators can handle the compressed formats of sparse convolutional neural networks, the sparse and irregular distribution of nonzero elements causes low utilization of the multipliers within a processing engine (PE) and imbalanced computation across PEs. This brief addresses these issues by proposing a data screening and task mapping (DSTM) accelerator that integrates a series of techniques spanning software refinement and hardware modules. An efficient indexing module identifies the effectual computation pairs and skips unnecessary computations in a fine-grained manner. Intra-PE load imbalance is alleviated through weight data rearrangement, and an effective task-sharing mechanism further balances the computation between PEs. Compared with the state-of-the-art SIMD-like accelerator, the proposed DSTM enhances average PE utilization by 3.5× and achieves 59.7% higher overall processing throughput than the previous design.
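The abstract's central idea — screening operand pairs so that only effectual (both-nonzero) multiplications are performed — can be illustrated with a minimal software sketch. This is an illustration of the general zero-skipping principle only, not the paper's DSTM indexing hardware; all names below are hypothetical.

```python
# Illustrative sketch of fine-grained zero-skipping in a sparse dot product,
# the kind of ineffectual-computation elimination the abstract describes.
# NOT the paper's DSTM design; function names and structure are hypothetical.

def effectual_pairs(activations, weights):
    """Yield only the (activation, weight) pairs where both operands are nonzero."""
    for a, w in zip(activations, weights):
        if a != 0 and w != 0:
            yield a, w

def sparse_dot(activations, weights):
    """Accumulate products over effectual pairs only, skipping zero-involving work."""
    return sum(a * w for a, w in effectual_pairs(activations, weights))

acts = [0, 3, 0, 2, 0, 1]
wts  = [5, 0, 7, 4, 0, 2]
# Only indices 3 and 5 have both operands nonzero: 2*4 + 1*2 = 10
print(sparse_dot(acts, wts))
```

In hardware, the same screening is done by an indexing unit over compressed-format metadata rather than by scanning dense vectors, so zero operands never occupy a multiplier lane.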
Year
2019
DOI
10.1109/TVLSI.2019.2897052
Venue
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Keywords
Task analysis, Computer architecture, Indexing, System-on-chip, Very large scale integration, Convolutional neural networks
Field
System on a chip, Convolutional neural network, Load balancing (computing), Computer science, Parallel computing, Search engine indexing, SIMD, Real-time computing, Throughput, Very-large-scale integration, Computation
DocType
Journal
Volume
27
Issue
5
ISSN
1063-8210
Citations
2
PageRank
0.37
References
0
Authors
3
Name                  | Order | Citations | PageRank
Bo-Cheng Charles Lai  | 1     | 177       | 19.25
Jyun-Wei Pan          | 2     | 2         | 0.37
Chien-Yu Lin          | 3     | 2         | 0.71