Abstract |
---|
Overlay architectures are a good way to enable fast development and debugging on FPGAs, at the expense of potentially limited performance compared to fully customized FPGA designs. When used in concert with a hand-tuned FPGA solution, a performant overlay architecture can improve the time-to-solution and thus the overall productivity of FPGA solutions. In this work, we tune and specialize FGPU, an open-source OpenCL-programmable GPU overlay for FPGAs. We demonstrate that our PDL-FGPU architecture maintains the ease of programming and generality of a software-programmable soft GPU while achieving high performance through specialization for the persistent deep learning (PDL) domain. We also propose an easy method to specialize for different domains. PDL-FGPU includes new instructions, along with micro-architecture and compiler enhancements. We evaluate both the FGPU baseline and the proposed PDL-FGPU on a modern high-end Intel Stratix 10 2800 FPGA running a set of persistent DL applications (RNN, GRU, LSTM), as well as general non-DL applications to demonstrate generality. PDL-FGPU requires 1.5-3x more ALMs, 4.4-6.4x more M20Ks, and 4.6-10x more DSPs than the FGPU baseline, but improves performance by 55-727x for persistent DL applications, with an average 15% degradation on general non-PDL applications. We also demonstrate that PDL-FGPU is only 4-7x slower than the Nvidia Volta V100 GPU. |
Year | DOI | Venue |
---|---|---|
2019 | 10.1109/FPL.2019.00059 | 2019 29th International Conference on Field Programmable Logic and Applications (FPL) |
Keywords | Field | DocType |
overlay, specialization, FPGA, GPU, soft GPU, persistent deep learning, RNN | Stratix, Computer architecture, Computer science, Parallel computing, Field-programmable gate array, Compiler, Software, Artificial intelligence, Deep learning, Overlay, Generality, Debugging | Conference |
ISSN | ISBN | Citations |
1946-147X | 978-1-7281-4885-4 | 1 |
PageRank | References | Authors |
0.34 | 0 | 10 |
Name | Order | Citations | PageRank |
---|---|---|---|
Rui Ma | 1 | 1 | 1.02 |
Derek Chiou | 2 | 718 | 48.97 |
Jia-Ching Hsu | 3 | 1 | 0.68 |
Tian Tan | 4 | 3 | 2.11 |
Eriko Nurvitadhi | 5 | 399 | 33.08 |
David Sheffield | 6 | 33 | 3.54 |
Rob Pelt | 7 | 1 | 0.34 |
Martin Langhammer | 8 | 104 | 20.22 |
Jaewoong Sim | 9 | 384 | 17.25 |
Aravind Dasu | 10 | 10 | 4.47 |