Title
Reconfigurable and Low-Complexity Accelerator for Convolutional and Generative Networks Over Finite Fields
Abstract
Convolutional neural networks (CNNs) have achieved great success in various fields, such as computer vision and natural language processing. Moreover, with breakthroughs in unsupervised learning, generative adversarial networks (GANs) have recently been used to generate virtual data from limited data sets. The generative model of GAN has impressive applications, such as style transfer and image super-resolution. However, the promising performance of CNNs and GANs comes at the cost of prohibitive computational complexity. The convolution (CONV) in CNN and the transposed CONV (TCONV) in GAN are the two operations that dominate the overall complexity. Prior works exploit fast algorithms, namely Winograd and the fast Fourier transform (FFT), to reduce the complexity of spatial CONV. However, Winograd only supports fixed filter sizes, while FFT incurs high transform overhead. Moreover, very few works apply fast algorithms to accelerate GAN models. In this article, a reconfigurable and low-complexity ASIC accelerator for both CNN and GAN is proposed to address these problems. First, by exploiting the Fermat number transform (FNT), we propose two FNT-based fast algorithms that reduce the complexity of CONV and TCONV computations, respectively. Then, the architecture of the FNT-based accelerator is presented to implement the proposed fast algorithms. The methodology for determining the design parameters and optimizing the dataflow is also described to obtain maximum performance and optimal efficiency. Moreover, we implement the proposed accelerator in 65-nm 1P9M technology and evaluate it on various CNN and GAN models. The post-layout results show that our design achieves a throughput of 288.0 GOP/s on VGG-16 with an area efficiency of 25.11 GOP/s/mm², which is superior to state-of-the-art CNN accelerators. Furthermore, at least a 1.7× speedup over existing accelerators is obtained on GAN. The resulting energy efficiency is 275.3× and 12.5× that of CPU and GPU, respectively.
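As a minimal illustration of the FNT-convolution idea summarized above (forward-transform both operands over a Fermat prime, multiply pointwise, then inverse-transform), the following Python sketch computes a 1-D linear convolution modulo F_4 = 65537. The naive O(N^2) transform, the power-of-two transform length, and the generator 3 are simplifying assumptions chosen for clarity; they are not the paper's hardware algorithm or parameters.

# Hedged sketch of convolution via a Fermat number transform (FNT).
# Assumptions (not from the paper): modulus F_4 = 65537, generator 3,
# naive O(N^2) transforms, power-of-two transform length.
P = 2**16 + 1  # Fermat prime F_4; all arithmetic is over GF(P)

def fnt(x, root):
    # Naive number-theoretic transform of x over GF(P) with the given root
    n = len(x)
    return [sum(v * pow(root, i * j, P) for j, v in enumerate(x)) % P
            for i in range(n)]

def fnt_conv(a, b):
    # Linear convolution via FNT: zero-pad, transform, multiply pointwise,
    # inverse-transform, and truncate to the linear-convolution length
    n = 1
    while n < len(a) + len(b) - 1:
        n *= 2
    root = pow(3, (P - 1) // n, P)  # 3 is a primitive root mod 65537
    A = fnt(a + [0] * (n - len(a)), root)
    B = fnt(b + [0] * (n - len(b)), root)
    C = [x * y % P for x, y in zip(A, B)]
    inv_root, inv_n = pow(root, P - 2, P), pow(n, P - 2, P)
    c = [v * inv_n % P for v in fnt(C, inv_root)]
    return c[:len(a) + len(b) - 1]

# Sanity check (exact as long as all true outputs stay below P)
print(fnt_conv([1, 2, 3, 4], [1, 1, 1]))  # -> [1, 3, 6, 9, 7, 4]

The result matches direct convolution whenever the true output values stay below the modulus, which is the condition any fixed-point quantization in such a design must respect.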
Year
2020
DOI
10.1109/TCAD.2020.2973355
Venue
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Keywords
Convolutional neural network (CNN), fast convolution (CONV), Fermat number transform (FNT), generative network, reconfigurable architectures
DocType
Journal
Volume
39
Issue
12
ISSN
0278-0070
Citations
1
PageRank
0.34
References
0
Authors
4
Name           Order  Citations  PageRank
Weihong Xu     1      10         3.20
Zaichen Zhang  2      134        20.67
Xiaohu You     3      2529       272.49
Chuan Zhang    4      100        13.67