Title
Optimizing Order-Associative Kernel Computation with Joint Memory Banking and Data Reuse.
Abstract
In this paper, we develop a joint strategy of memory banking and data reuse to optimize the memory performance of any order-associative, stencil-based computing kernel, i.e., a kernel whose iteration order can be rearranged freely without compromising its correctness. Given a stencil kernel of any shape, our methodology achieves a throughput of one kernel per clock cycle with only two memory banks and two data reuse buffers of small constant size, provided the kernel is order-associative. This is a huge leap over all existing results for general stencil-based computing, where, depending on the specific data reuse method, either a number of data reuse buffers proportional to the stencil size is required or a potentially problem-dependent reuse buffer size is needed. Furthermore, the optimal memory partition factor of existing methods is typically proportional to the stencil size of a given kernel, whereas in our method the number of memory banks remains 2 irrespective of the stencil shape and size. On average, compared with mainstream methods, our approach achieves an approximately 30-70% reduction in hardware usage while improving performance by about 15%. Moreover, the number of independent memory banks required for conflict-free data accesses drops by more than 30%.
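The order-associativity property described in the abstract can be illustrated with a minimal sketch. It shows only that visiting the output points of a stencil kernel in a different order leaves the result unchanged; it does not reproduce the paper's memory banking or data reuse scheme. The 5-point cross stencil, the grid size, and the names `STENCIL`, `apply_stencil`, and `interior_points` are assumptions chosen for illustration.

```python
import random

# Hypothetical 5-point "cross" stencil, given as (row, col) offsets.
STENCIL = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]


def apply_stencil(grid, order):
    """Evaluate the stencil at each point in `order`, writing to a separate output."""
    out = {}
    for r, c in order:
        out[(r, c)] = sum(grid[r + dr][c + dc] for dr, dc in STENCIL)
    return out


def interior_points(rows, cols):
    """All points whose full stencil neighbourhood lies inside the grid."""
    return [(r, c) for r in range(1, rows - 1) for c in range(1, cols - 1)]


rows, cols = 6, 8
# Integer data keeps the comparison exact (no floating-point rounding effects).
grid = [[r * cols + c for c in range(cols)] for r in range(rows)]

row_major = interior_points(rows, cols)
shuffled = row_major[:]
random.Random(0).shuffle(shuffled)  # an arbitrary but reproducible iteration order

result_a = apply_stencil(grid, row_major)
result_b = apply_stencil(grid, shuffled)
same = result_a == result_b  # identical outputs regardless of iteration order
```

Because each output point reads only from the input grid and writes to a separate output, the iteration order is free to change, which is the freedom the abstract says the method relies on.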
Year
2019
DOI
10.1145/3289602.3293980
Venue
FPGA
Field
Kernel (linear algebra), Memory bank, Reuse, Computer science, Parallel computing, Correctness, Stencil, Field-programmable gate array, Throughput, Cycles per instruction
DocType
Conference
ISBN
978-1-4503-6137-8
Citations
0
PageRank
0.34
References
0
Authors
2
Name                      Order   Citations   PageRank
Juan G. Rueda-Escobedo    1       14          4.96
Mingjie Lin               2       73          25.04