Specializing FGPU for Persistent Deep Learning - Citegraph

Paper Info

Title
Specializing FGPU for Persistent Deep Learning

Abstract
Overlay architectures are a good way to enable fast development and debug on FPGAs at the expense of potentially limited performance compared to fully customized FPGA designs. When used in concert with hand-tuned FPGA solutions, performant overlay architectures can improve time-to-solution and thus overall productivity of FPGA solutions. This work tunes and specializes FGPU, an open source OpenCL-programmable GPU overlay for FPGAs. We demonstrate that our persistent deep learning (PDL)-FGPU architecture maintains the ease-of-programming and generality of GPU programming while achieving high performance from specialization for the persistent deep learning domain. We also propose an easy method to specialize for other domains. PDL-FGPU includes new instructions, along with micro-architecture and compiler enhancements. We evaluate both the FGPU baseline and the proposed PDL-FGPU on a modern high-end Intel Stratix 10 2800 FPGA in simulation running persistent DL applications (RNN, GRU, LSTM), and non-DL applications to demonstrate generality. PDL-FGPU requires 1.4–3× more ALMs, 4.4–6.4× more M20Ks, and 1–9.5× more DSPs than baseline, but improves performance by 56–693× for PDL applications with an average 23.1% degradation on non-PDL applications. We integrated the PDL-FGPU overlay into Intel OPAE to measure real-world performance/power and demonstrate that PDL-FGPU is only 4.0–10.4× slower than the Nvidia V100.

Year	2021
DOI	10.1145/3457886
Venue	ACM Transactions on Reconfigurable Technology and Systems
Keywords	Overlay, specialization, FPGA, GPU, soft GPU, persistent deep learning, RNN
DocType	Journal
Volume	14
Issue	2
ISSN	1936-7406
Citations	0
PageRank	0.34
References	0
Authors	10

Authors (10 rows)

Name	Order	Citations	PageRank
Rui Ma	1	1	1.02
Jia-Ching Hsu	2	0	0.34
Tian Tan	3	3	2.11
Eriko Nurvitadhi	4	0	0.34
David Sheffield	5	33	3.54
Rob Pelt	6	0	0.34
Martin Langhammer	7	104	20.22
Jaewoong Sim	8	384	17.25
Aravind Dasu	9	10	4.47
Derek Chiou	10	718	48.97
