Title
Double MAC: Doubling the performance of convolutional neural networks on modern FPGAs.
Abstract
This paper presents a novel method, called Double MAC, to double the computation rate of convolutional neural network (CNN) accelerators by packing two multiply-and-accumulate (MAC) operations into one DSP block of off-the-shelf FPGAs. While a general SIMD MAC using a single DSP block seems impossible, our solution is tailored for the kind of MAC operations required for a convolution layer. Our preliminary evaluation shows that the Double MAC approach not only doubles the computation throughput of a CNN layer with essentially the same resource utilization, but also improves network-level performance by 14% to 84% over a highly optimized state-of-the-art accelerator solution, depending on the CNN hyper-parameters.
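The abstract does not spell out the packing scheme, so the following is only a minimal sketch of the general idea of getting two products out of one wide multiplication. It assumes, as an illustration and not as the authors' stated design, that the two multiplications share one operand `c` and that all operands are signed 8-bit integers. The function name `packed_multiply` and the 16-bit lane width are hypothetical choices made for this example.

```python
def packed_multiply(a, b, c, shift=16):
    """Compute a*c and b*c with a single wide multiplication.

    Assumption (not from the abstract): a, b, c are signed 8-bit values,
    so each product fits in a signed 16-bit lane.
    """
    # Pack a into the upper lane and b into the lower lane.
    packed = (a << shift) + b
    # One multiplication yields (a*c << shift) + b*c.
    product = packed * c

    # The lower lane holds b*c modulo 2**shift; reinterpret it as signed.
    low = product & ((1 << shift) - 1)
    if low >= 1 << (shift - 1):
        low -= 1 << shift

    # Removing the lower lane leaves exactly (a*c) << shift.
    high = (product - low) >> shift
    return high, low


if __name__ == "__main__":
    print(packed_multiply(-3, 7, 5))
```

The sign correction on the lower lane matters: a negative `b*c` borrows from the upper lane, so the upper product is recovered only after subtracting the signed lower value. In hardware the usable operand widths depend on the DSP block's multiplier ports, which this sketch does not model.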
Year
2017
Venue
DATE
Field
Network level, Digital signal processing, Convolutional neural network, Convolution, Computer science, Parallel computing, Field-programmable gate array, SIMD, Throughput, Computation
DocType
Conference
ISSN
1530-1591
Citations
3
PageRank
0.41
References
8
Authors
3
Name, Order, Citations, PageRank
Dong Nguyen, 1, 68249.92
Daewoo Kim, 2, 81.24
Jongeun Lee, 3, 42933.71