Abstract |
---|
Overlay architectures are a good way to enable fast development and debugging on FPGAs at the expense of potentially limited performance compared to fully customized FPGA designs. When used in concert with hand-tuned FPGA solutions, performant overlay architectures can improve time-to-solution and thus overall productivity of FPGA solutions. This work tunes and specializes FGPU, an open-source OpenCL-programmable GPU overlay for FPGAs. We demonstrate that our persistent deep learning (PDL)-FGPU architecture maintains the ease of programming and generality of GPU programming while achieving high performance from specialization for the persistent deep learning domain. We also propose an easy method to specialize for other domains. PDL-FGPU includes new instructions, along with micro-architecture and compiler enhancements. We evaluate both the FGPU baseline and the proposed PDL-FGPU on a modern high-end Intel Stratix 10 2800 FPGA in simulation running persistent DL applications (RNN, GRU, LSTM), and non-DL applications to demonstrate generality. PDL-FGPU requires 1.4–3× more ALMs, 4.4–6.4× more M20Ks, and 1–9.5× more DSPs than baseline, but improves performance by 56–693× for PDL applications with an average 23.1% degradation on non-PDL applications. We integrated the PDL-FGPU overlay into Intel OPAE to measure real-world performance/power and demonstrate that PDL-FGPU is only 4.0–10.4× slower than the Nvidia V100. |

Field | Value |
---|---|
Year | 2021 |
DOI | 10.1145/3457886 |
Venue | ACM Transactions on Reconfigurable Technology and Systems |
Keywords | Overlay, specialization, FPGA, GPU, soft GPU, persistent deep learning, RNN |
DocType | Journal |
Volume | 14 |
Issue | 2 |
ISSN | 1936-7406 |
Citations | 0 |
PageRank | 0.34 |
References | 0 |
Authors | 10 |

Name | Order | Citations | PageRank |
---|---|---|---|
Rui Ma | 1 | 1 | 1.02 |
Jia-Ching Hsu | 2 | 0 | 0.34 |
Tian Tan | 3 | 3 | 2.11 |
Eriko Nurvitadhi | 4 | 0 | 0.34 |
David Sheffield | 5 | 33 | 3.54 |
Rob Pelt | 6 | 0 | 0.34 |
Martin Langhammer | 7 | 104 | 20.22 |
Jaewoong Sim | 8 | 384 | 17.25 |
Aravind Dasu | 9 | 10 | 4.47 |
Derek Chiou | 10 | 718 | 48.97 |