Abstract |
---|
A technique to speed up stencil computations is introduced. Computation and data reuse schemes are developed for its application to 1- and 3-dimensional stencils. The approach traverses the data domain fewer times than a straightforward, state-of-the-art iterative stencil implementation does. Performance results are shown for a variety of platforms, illustrating how readily the technique can be combined with existing techniques and frameworks. The technique, named Aggregate Stencil-Loop Iteration (ASLI), applies a stencil obtained by convolving the original stencil operator with itself one or more times. This more complex operator creates new opportunities for in-register data reuse and increases the FLOPs-to-load ratio. The total number of FLOPs decreases for 1D stencils but increases for 2D and 3D star-shaped stencils. In both cases, a speed-up relative to the state of the art is achieved. ASLI is relatively easy to implement and works synergistically with existing methods for optimizing stencil computations. |
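
The core idea in the abstract, replacing several sweeps of a stencil with one sweep of the stencil convolved with itself, can be illustrated with a minimal 1D sketch. This is not the authors' implementation: the helper names `convolve` and `apply_stencil`, the 3-point averaging coefficients, and the test data are all assumptions chosen for illustration, and boundary points are simply dropped rather than handled.

```python
def convolve(a, b):
    """Full discrete convolution of two coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def apply_stencil(coeffs, u):
    """Apply an odd-length 1D stencil to the interior points of u only."""
    r = len(coeffs) // 2  # stencil radius
    return [
        sum(c * u[i + k - r] for k, c in enumerate(coeffs))
        for i in range(r, len(u) - r)
    ]


# Hypothetical 3-point averaging stencil (illustrative coefficients).
stencil = [0.25, 0.5, 0.25]

# Aggregated operator: the stencil convolved with itself once (5 points).
aggregated = convolve(stencil, stencil)

# Illustrative input data.
u = [float((i * i) % 7) for i in range(12)]

# Two sweeps over the domain with the original stencil ...
two_sweeps = apply_stencil(stencil, apply_stencil(stencil, u))

# ... versus a single sweep with the aggregated stencil.
one_sweep = apply_stencil(aggregated, u)

max_diff = max(abs(a - b) for a, b in zip(one_sweep, two_sweeps))
print("aggregated coefficients:", aggregated)
print("max difference between the two approaches:", max_diff)
```

In this sketch, counting one multiply per coefficient and one add per accumulation, two sweeps of the 3-point stencil cost 10 FLOPs per output point while one sweep of the 5-point aggregated stencil costs 9, which is consistent with the abstract's statement that the FLOP count decreases in 1D. The single sweep also reads the input array once instead of twice.
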
Year | DOI | Venue |
---|---|---|
2016 | 10.1109/SBAC-PAD.2016.18 | 2016 28th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD) |
Keywords | Field | DocType |
---|---|---|
stencil computation, optimization, high-performance computing, numeric kernels | Kernel (linear algebra), Data domain, Supercomputer, Convolution, Computer science, Stencil, Parallel computing, Stencil code, Speedup, Computation | Conference |
ISSN | ISBN | Citations |
---|---|---|
1550-6533 | 978-1-5090-6109-9 | 0 |
PageRank | References | Authors |
---|---|---|
0.34 | 1 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Guilherme C. Januario | 1 | 21 | 3.87 |
Bryan Rosenburg | 2 | 225 | 17.35 |
Yoonho Park | 3 | 350 | 35.57 |
Michael Perrone | 4 | 0 | 0.68 |
José E. Moreira | 5 | 2282 | 230.26 |
Tereza Cristina M. B. Carvalho | 6 | 57 | 15.32 |