Abstract | ||
---|---|---|
Acceleration in the form of customized datapaths offers large performance and energy improvements over general-purpose processors. Reconfigurable fabrics such as FPGAs are gaining popularity for use in implementing application-specific accelerators, thereby increasing the importance of having good high-level FPGA design tools. However, current tools for targeting FPGAs offer inadequate support for high-level programming, resource estimation, and rapid and automatic design space exploration. We describe a design framework that addresses these challenges. We introduce a new representation of hardware using parameterized templates that captures locality and parallelism information at multiple levels of nesting. This representation is designed to be automatically generated from high-level languages based on parallel patterns. We describe a hybrid area estimation technique which uses template-level models and design-level artificial neural networks to account for effects from hardware place-and-route tools, including routing overheads, register and block RAM duplication, and LUT packing. Our runtime estimation accounts for off-chip memory accesses. We use our estimation capabilities to rapidly explore a large space of designs across tile sizes, parallelization factors, and optional coarse-grained pipelining, all at multiple loop levels. We show that estimates average 4.8% error for logic resources, 6.1% error for runtimes, and are 279 to 6533 times faster than a commercial high-level synthesis tool. We compare the best-performing designs to optimized CPU code running on a server-grade 6-core processor and show speedups of up to 16.7×. |
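
The exploration loop the abstract describes (sweeping tile sizes, parallelization factors, and optional coarse-grained pipelining against fast estimates) can be illustrated with a minimal sketch. This is not the paper's estimator: the cost model, every constant, and the names `estimate`, `explore`, and `pareto_front` are hypothetical stand-ins chosen only to show the shape of estimate-driven design space exploration.

```python
from itertools import product
from math import ceil

# All constants below are made-up illustration values, not measured data.
N = 1 << 20              # number of elements to process
MEM_LATENCY = 100        # assumed fixed off-chip access latency (cycles)
WORDS_PER_CYCLE = 4      # assumed off-chip bandwidth
LANE_AREA = 50           # assumed area units per parallel lane
BUF_AREA_PER_WORD = 1    # assumed area units per buffered word
AREA_BUDGET = 10_000     # assumed device capacity in area units


def estimate(tile, par, pipelined):
    """Toy analytic model: return (cycles, area) for one design point."""
    tiles = ceil(N / tile)
    load = MEM_LATENCY + ceil(tile / WORDS_PER_CYCLE)  # off-chip tile load
    compute = ceil(tile / par)                         # on-chip tile compute
    if pipelined:
        # Load and compute overlap across tiles; buffers are doubled.
        cycles = load + compute + (tiles - 1) * max(load, compute)
        buffers = 2
    else:
        cycles = tiles * (load + compute)
        buffers = 1
    area = par * LANE_AREA + buffers * tile * BUF_AREA_PER_WORD
    return cycles, area


def explore(tile_sizes, par_factors):
    """Enumerate every design point and keep those within the area budget."""
    points = []
    for tile, par, pipelined in product(tile_sizes, par_factors, (False, True)):
        if par > tile:
            continue  # more lanes than elements per tile is wasted hardware
        cycles, area = estimate(tile, par, pipelined)
        if area <= AREA_BUDGET:
            points.append((cycles, area, tile, par, pipelined))
    return points


def pareto_front(points):
    """Keep points that no other point beats on both cycles and area."""
    front = []
    for p in points:
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and (q[0] < p[0] or q[1] < p[1])
            for q in points
        )
        if not dominated:
            front.append(p)
    return front


points = explore([256, 1024, 4096], [1, 4, 16, 64])
front = pareto_front(points)
best = min(points, key=lambda p: p[0])  # fastest design that fits
```

Because each call to `estimate` is a few arithmetic operations, the whole space can be swept exhaustively; in a real flow the toy model would be replaced by calibrated area and runtime estimators.
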
Year | DOI | Venue |
---|---|---|
2016 | 10.1109/ISCA.2016.20 | ISCA |
Keywords | Field | DocType |
hardware generation,design space exploration,FPGAs,parallel patterns,hardware definition language,reconfigurable hardware,application-specific accelerators | Lookup table,Pipeline (computing),Locality,Computer science,Parallel computing,Field-programmable gate array,Real-time computing,Artificial neural network,Design space exploration,Multi-core processor,Reconfigurable computing | Conference |
ISSN | ISBN | Citations |
1063-6897 | 978-1-4673-8948-8 | 19 |
PageRank | References | Authors |
0.70 | 22 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
David Koeplinger | 1 | 41 | 1.72 |
Raghu Prabhakar | 2 | 40 | 1.70 |
Yaqi Zhang | 3 | 44 | 2.12 |
Christina Delimitrou | 4 | 444 | 20.12 |
Christos Kozyrakis | 5 | 5817 | 355.99 |
Kunle Olukotun | 6 | 4532 | 373.50 |