Abstract | ||
---|---|---|
Driven by its high flexibility, good performance, and energy efficiency, GPGPU has taken on an increasingly important role in embedded systems. In this paper, we present the basic core of FGPU: a GPU-like, scalable, and portable integer soft SIMT processor implemented in RTL and optimized for FPGA synthesis, with a single-level cache system. Compared to a performance-optimized MicroBlaze implementation on the same FPGA, the largest implemented FGPU core achieves an average wall-clock speedup of 49x and a measured power saving of 3.7x, at an area overhead of 17.7x. Compared to an ARM CPU with a NEON vector processor, we measured an average speedup of 3.5x across the benchmarks used. FGPU is highly parameterizable and contains no manufacturer-specific IP cores or primitives. |
Year | DOI | Venue |
---|---|---|
2016 | 10.1145/2847263.2847273 | ACM/SIGDA International Symposium on Field-Programmable Gate Arrays |
Keywords | Field | DocType |
GPGPU, SIMT, softGPU, FPGA | MicroBlaze, ARM architecture, Cache, Computer science, Parallel computing, Field-programmable gate array, Real-time computing, General-purpose computing on graphics processing units, Vector processor, Embedded system, Scalability, Speedup | Conference |
Citations | PageRank | References |
8 | 0.59 | 11 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Muhammed Al Kadi | 1 | 19 | 3.91 |
Benedikt Janßen | 2 | 13 | 3.51 |
Michael Hübner | 3 | 390 | 47.98 |