Title | ||
---|---|---|
Optimizing Order-Associative Kernel Computation with Joint Memory Banking and Data Reuse |
Abstract | ||
---|---|---|
In this paper, we develop a joint strategy of memory banking and data reuse to optimize the memory performance of any order-associative, stencil-based computing kernel, i.e., a kernel whose iterations can be reordered freely without compromising correctness. For a stencil kernel of any shape, and provided the kernel is order-associative, our methodology achieves a throughput of one kernel per clock cycle with only two memory banks and two data reuse buffers of small, constant size. This is a substantial improvement over existing results for general stencil-based computing, which, depending on the specific data reuse method, require either a number of data reuse buffers proportional to the stencil size or a reuse buffer whose size may depend on the problem. Furthermore, the optimal memory partition factor of existing methods is typically proportional to the stencil size of a given kernel, whereas in our method the number of memory banks remains 2 irrespective of stencil shape and size. On average, compared with mainstream methods, our approach reduces hardware usage by approximately 30-70% while improving performance by about 15%. Moreover, the number of independent memory banks required to achieve conflict-free data accesses drops by more than 30%.
|
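The order-associative property named in the abstract can be illustrated with a small sketch. This is not the paper's banking or reuse scheme. It only shows that, for an out-of-place stencil, visiting the iteration points in a different order yields the same result. The 5-point stencil shape, the grid contents, and the names `STENCIL`, `apply_stencil`, and `interior_points` are assumptions chosen for illustration.

```python
import random

# Hypothetical 5-point stencil given as (row, col) offsets; any shape would do.
STENCIL = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]


def interior_points(rows, cols):
    """List the grid points where the whole stencil stays in bounds."""
    return [(i, j) for i in range(1, rows - 1) for j in range(1, cols - 1)]


def apply_stencil(grid, order):
    """Compute the out-of-place stencil sum at each point, visiting points in `order`."""
    out = {}
    for i, j in order:
        out[(i, j)] = sum(grid[i + di][j + dj] for di, dj in STENCIL)
    return out


# Illustrative integer grid: cell (r, c) holds r * 6 + c.
grid = [[r * 6 + c for c in range(6)] for r in range(5)]

row_major = interior_points(5, 6)
shuffled = row_major[:]
random.Random(0).shuffle(shuffled)  # an arbitrary alternative iteration order

result_a = apply_stencil(grid, row_major)
result_b = apply_stencil(grid, shuffled)
same = result_a == result_b  # reordering the iterations does not change the output
```

Because each output depends only on the read-only input grid, any visiting order is valid, which is the freedom a scheduler can exploit when choosing an access pattern.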
Year | DOI | Venue |
---|---|---|
2019 | 10.1145/3289602.3293980 | FPGA |
Field | DocType | ISBN |
Kernel (linear algebra),Memory bank,Reuse,Computer science,Parallel computing,Correctness,Stencil,Field-programmable gate array,Throughput,Cycles per instruction | Conference | 978-1-4503-6137-8 |
Citations | PageRank | References |
0 | 0.34 | 0 |
Authors | ||
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Juan G. Rueda-Escobedo | 1 | 14 | 4.96 |
Mingjie Lin | 2 | 73 | 25.04 |