Title
Reconfigurable and Low-Complexity Accelerator for Convolutional and Generative Networks Over Finite Fields
Abstract
Convolutional neural networks (CNNs) have achieved great success in various fields, such as computer vision and natural language processing. Moreover, with breakthroughs in unsupervised learning, generative adversarial networks (GANs) have recently been used to generate virtual data from limited data sets. The generative model of GAN has impressive applications, such as style transfer and image super-resolution. However, the promising performance of CNNs and GANs comes at the cost of prohibitive computational complexity. The convolution (CONV) in CNN and the transposed CONV (TCONV) in GAN are the two operations that dominate the overall complexity. Prior works exploit fast algorithms, namely Winograd and the fast Fourier transform (FFT), to reduce the complexity of spatial CONV. However, Winograd only supports fixed filter sizes, while FFT incurs high transform overhead. Moreover, very few works apply fast algorithms to accelerate GAN models. In this article, a reconfigurable and low-complexity ASIC accelerator for both CNN and GAN is proposed to address these problems. First, by exploiting the Fermat number transform (FNT), we propose two FNT-based fast algorithms that reduce the complexity of CONV and TCONV computations, respectively. Then, the architecture of the FNT-based accelerator is presented to implement the proposed fast algorithms. The methodology for determining the design parameters and optimizing the dataflow is also described to obtain maximum performance and optimal efficiency. Moreover, we implement the proposed accelerator in 65-nm 1P9M technology and evaluate it on various CNN and GAN models. The post-layout results show that our design achieves a throughput of 288.0 GOP/s on VGG-16 with an area efficiency of 25.11 GOP/s/mm², which is superior to state-of-the-art CNN accelerators. Furthermore, at least a 1.7× speedup over existing accelerators is obtained on GAN. The resulting energy efficiency is 275.3× and 12.5× that of CPU and GPU, respectively.
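As a minimal illustration of the FNT-convolution idea summarized above (forward-transform both operands over a Fermat prime, multiply pointwise, then inverse-transform), the following Python sketch computes a 1-D linear convolution modulo F_4 = 65537. The naive O(N^2) transform, the power-of-two transform length, and the generator 3 are simplifying assumptions chosen for clarity; they are not the paper's hardware algorithm or parameters.

# Hedged sketch of convolution via a Fermat number transform (FNT).
# Assumptions (not from the paper): modulus F_4 = 65537, generator 3,
# naive O(N^2) transforms, power-of-two transform length.
P = 2**16 + 1  # Fermat prime F_4; all arithmetic is over GF(P)

def fnt(x, root):
    # Naive number-theoretic transform of x over GF(P) with the given root
    n = len(x)
    return [sum(v * pow(root, i * j, P) for j, v in enumerate(x)) % P
            for i in range(n)]

def fnt_conv(a, b):
    # Linear convolution via FNT: zero-pad, transform, multiply pointwise,
    # inverse-transform, and truncate to the linear-convolution length
    n = 1
    while n < len(a) + len(b) - 1:
        n *= 2
    root = pow(3, (P - 1) // n, P)  # 3 is a primitive root mod 65537
    A = fnt(a + [0] * (n - len(a)), root)
    B = fnt(b + [0] * (n - len(b)), root)
    C = [x * y % P for x, y in zip(A, B)]
    inv_root, inv_n = pow(root, P - 2, P), pow(n, P - 2, P)
    c = [v * inv_n % P for v in fnt(C, inv_root)]
    return c[:len(a) + len(b) - 1]

# Sanity check (exact as long as all true outputs stay below P)
print(fnt_conv([1, 2, 3, 4], [1, 1, 1]))  # -> [1, 3, 6, 9, 7, 4]

The result matches direct convolution whenever the true output values stay below the modulus, which is the condition any fixed-point quantization in such a design must respect.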
Year
2020
DOI
10.1109/TCAD.2020.2973355
Venue
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Keywords
Convolutional neural network (CNN), fast convolution (CONV), Fermat number transform (FNT), generative network, reconfigurable architectures
DocType
Journal
Volume
39
Issue
12
ISSN
0278-0070
Citations
1
PageRank
0.34
References
0
Authors
4
Name           Order  Citations  PageRank
Weihong Xu     1      10         3.20
Zaichen Zhang  2      134        20.67
Xiaohu You     3      2529       272.49
Chuan Zhang    4      100        13.67