Abstract | ||
---|---|---|
Today, convolutional and deconvolutional neural network models are exceptionally popular thanks to the impressive accuracies they have achieved in several computer-vision applications. To speed up the overall tasks of these neural networks, purpose-designed accelerators are highly desirable. Unfortunately, the high computational complexity and the huge memory demand make the design of efficient hardware architectures, as well as their deployment in resource- and power-constrained embedded systems, still quite challenging. This paper presents a novel purpose-designed hardware accelerator to perform 2D deconvolutions. The proposed structure applies a hardware-oriented computational approach that overcomes the issues of traditional deconvolution methods, and it is suitable for implementation within virtually any system-on-chip based on field-programmable gate array devices. In fact, the novel accelerator is easily scalable to comply with the resources available within both high- and low-end devices by adequately scaling the adopted parallelism. As an example, when exploited to accelerate the Deep Convolutional Generative Adversarial Network model, the novel accelerator, running as a standalone unit implemented within the Xilinx Zynq XC7Z020 System-on-Chip (SoC) device, performs up to 72 GOPs. Moreover, it dissipates less than 500 mW@200 MHz and occupies ~5.6%, ~4.1%, ~17%, and ~96%, respectively, of the look-up tables, flip-flops, random access memory, and digital signal processors available on-chip. When accommodated within the same device, the whole embedded system equipped with the novel accelerator performs up to 54 GOPs and dissipates less than 1.8 W@150 MHz. Thanks to the increased exploitable parallelism, more than 900 GOPs can be executed when the high-end Virtex-7 XC7VX690T device is used as the implementation platform. Moreover, in comparison with state-of-the-art competitors implemented within the Zynq XC7Z045 device, the system proposed here reaches a computational capability up to ~20% higher and saves more than 60% of the power consumption and more than 80% of the logic resources, while using ~5.7x fewer on-chip memory resources. |
Year | DOI | Venue |
---|---|---|
2020 | 10.3390/jimaging6090085 | JOURNAL OF IMAGING |
Keywords | DocType | Volume |
image deconvolution, generative adversarial networks (GANs), field-programmable gate array (FPGA), heterogeneous embedded systems | Journal | 6 |
Issue | ISSN | Citations |
9 | 2313-433X | 2 |
PageRank | References | Authors |
0.41 | 0 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Stefania Perri | 1 | 264 | 33.11 |
Cristian Sestito | 2 | 3 | 0.81 |
Fanny Spagnolo | 3 | 8 | 4.00 |
Pasquale Corsonello | 4 | 278 | 38.06 |