Title
A High Performance Multi-Bit-Width Booth Vector Systolic Accelerator for NAS Optimized Deep Learning Neural Networks
Abstract
Multi-bit-width convolutional neural networks (CNNs) balance network accuracy against hardware efficiency, making them a promising approach to accurate yet energy-efficient edge computing. In this work, we develop a state-of-the-art multi-bit-width accelerator for NAS-optimized deep learning neural networks. To process multi-bit-width network inference efficiently, optimizations are proposed at multiple levels. First, a differential Neural Architecture Search (NAS) method is adopted to generate a high-accuracy multi-bit-width network. Second, a hybrid Booth-based multi-bit-width multiply-add-accumulation (MAC) unit is developed for data processing. Third, a vector systolic array is proposed to accelerate matrix multiplications effectively. With the vector-style systolic dataflow, both the processing time and the logic resource consumption are reduced compared with the classical systolic array. Finally, the proposed multi-bit-width CNN acceleration scheme has been deployed on a Xilinx ZCU102 FPGA platform. The average performance when accelerating the full NAS-optimized VGG16 network is 784.2 GOPS, and the peak performance of the convolutional layer reaches 871.26 GOPS for INT8, 1676.96 GOPS for INT4, and 2863.29 GOPS for INT2, which is among the best results in previous CNN accelerator benchmarks.
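To illustrate the Booth-based MAC idea mentioned in the abstract, here is a minimal Python sketch of generic textbook radix-4 Booth recoding. It is not the paper's hybrid hardware design, which the abstract does not detail. The function names `booth_radix4_digits` and `booth_mac` are hypothetical. The sketch assumes signed two's-complement operands with an even bit width. It shows that an INT8, INT4, or INT2 multiplier recodes into 4, 2, or 1 Booth digits respectively, which is one reason Booth-style units lend themselves to multiple bit widths.

```python
def booth_radix4_digits(y: int, bits: int) -> list[int]:
    """Recode a signed two's-complement integer into radix-4 Booth digits.

    Returns digits in {-2, -1, 0, 1, 2}, least significant first, such that
    sum(d * 4**k for k, d in enumerate(digits)) == y.
    """
    if bits <= 0 or bits % 2 != 0:
        raise ValueError("bits must be a positive even number")
    if not -(1 << (bits - 1)) <= y < (1 << (bits - 1)):
        raise ValueError("y does not fit in the given signed bit width")

    pattern = y & ((1 << bits) - 1)  # two's-complement bit pattern of y
    digits = []
    prev = 0  # implicit bit to the right of the least significant bit
    for i in range(0, bits, 2):
        low = (pattern >> i) & 1
        high = (pattern >> (i + 1)) & 1
        # Standard radix-4 Booth digit: -2*y[2k+1] + y[2k] + y[2k-1]
        digits.append(low + prev - 2 * high)
        prev = high
    return digits


def booth_mac(acc: int, x: int, y: int, bits: int) -> int:
    """Multiply-accumulate: acc + x * y, with y consumed one Booth digit at a time."""
    for k, digit in enumerate(booth_radix4_digits(y, bits)):
        # Each digit selects one partial product from {-2x, -x, 0, x, 2x},
        # weighted by 4**k (a left shift by 2k bits in hardware).
        acc += digit * x * (4 ** k)
    return acc


if __name__ == "__main__":
    for bits in (8, 4, 2):
        print(bits, "bits ->", len(booth_radix4_digits(-1, bits)), "Booth digit(s)")
    print(booth_mac(10, 7, -3, 4))
```

In this sketch the number of partial products halves each time the bit width halves, so a datapath sized for one INT8 product has room for several lower-precision products. How the paper's MAC unit actually exploits this is not stated in the abstract.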
Year
2022
DOI
10.1109/TCSI.2022.3178474
Venue
IEEE Transactions on Circuits and Systems I: Regular Papers
Keywords
Multi-bit-width CNN, systolic array, NAS, FPGA CNN
DocType
Journal
Volume
69
Issue
9
ISSN
1549-8328
Citations
0
PageRank
0.34
References
21
Authors
7
Name               Order   Citations   PageRank
Mingqiang Huang    1       0           0.68
Yucen Liu          2       0           0.68
Changhai Man       3       0           0.68
Kai Li             4       2           4.49
Quan Cheng         5       0           0.34
Wei Mao            6       0           0.68
Yu Yu              7       69          19.95