Efficient Deconvolution Architecture for Heterogeneous Systems-on-Chip. - Citegraph

Paper Info

Title
Efficient Deconvolution Architecture for Heterogeneous Systems-on-Chip.

Abstract
Today, convolutional and deconvolutional neural network models are exceptionally popular thanks to the impressive accuracies they have demonstrated in several computer-vision applications. To speed up the overall tasks of these neural networks, purpose-designed accelerators are highly desirable. Unfortunately, the high computational complexity and the huge memory demand still make it challenging to design efficient hardware architectures and to deploy them in resource- and power-constrained embedded systems. This paper presents a novel purpose-designed hardware accelerator to perform 2D deconvolutions. The proposed structure applies a hardware-oriented computational approach that overcomes the issues of traditional deconvolution methods, and it is suitable for implementation within virtually any system-on-chip based on field-programmable gate array devices. In fact, the novel accelerator is easily scalable to fit the resources available within both high- and low-end devices by adequately scaling the adopted parallelism. As an example, when exploited to accelerate the Deep Convolutional Generative Adversarial Network model, the novel accelerator, running as a standalone unit implemented within the Xilinx Zynq XC7Z020 System-on-Chip (SoC) device, performs up to 72 GOPs. Moreover, it dissipates less than 500 mW at 200 MHz and occupies ~5.6%, ~4.1%, ~17%, and ~96%, respectively, of the look-up tables, flip-flops, random access memory, and digital signal processors available on-chip. When accommodated within the same device, the whole embedded system equipped with the novel accelerator performs up to 54 GOPs and dissipates less than 1.8 W at 150 MHz. Thanks to the greater exploitable parallelism, more than 900 GOPs can be executed when the high-end Virtex-7 XC7VX690T device is used as the implementation platform. Moreover, in comparison with state-of-the-art competitors implemented within the Zynq XC7Z045 device, the system proposed here reaches a computational capability up to ~20% higher and reduces power consumption and logic resource requirements by more than 60% and 80%, respectively, while using ~5.7x fewer on-chip memory resources.

Year	DOI	Venue
2020	10.3390/jimaging6090085	JOURNAL OF IMAGING
Keywords	DocType	Volume
image deconvolution,generative adversarial networks (GANs),field-programmable gate array (FPGA),heterogeneous embedded systems	Journal	6
Issue	ISSN	Citations
9	2313-433X	2
PageRank	References	Authors
0.41	0	4

Authors (4 rows)


Name	Order	Citations	PageRank
Stefania Perri	1	264	33.11
Cristian Sestito	2	3	0.81
Fanny Spagnolo	3	8	4.00
Pasquale Corsonello	4	278	38.06
