Abstract | ||
---|---|---|
A stencil computation repeatedly updates each point of a d-dimensional grid as a function of itself and its near neighbors. Parallel cache-efficient stencil algorithms based on "trapezoidal decompositions" are known, but most programmers find them difficult to write. The Pochoir stencil compiler allows a programmer to write a simple specification of a stencil in a domain-specific stencil language embedded in C++, which the Pochoir compiler then translates into high-performing Cilk code that employs an efficient parallel cache-oblivious algorithm. Pochoir supports general d-dimensional stencils and handles both periodic and aperiodic boundary conditions in one unified algorithm. The Pochoir system provides a C++ template library that allows the user's stencil specification to be executed directly in C++ without the Pochoir compiler (albeit more slowly), which simplifies user debugging and greatly simplified the implementation of the Pochoir compiler itself. A host of stencil benchmarks run on a modern multicore machine demonstrates that Pochoir outperforms standard parallel-loop implementations, typically running 2-10 times faster. The algorithm behind Pochoir improves on prior cache-efficient algorithms on multidimensional grids by making "hyperspace" cuts, which yield asymptotically more parallelism for the same cache efficiency. |
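
To make the abstract's definition of a stencil computation concrete, here is a minimal sketch of a 3-point heat stencil on a 1D grid with periodic boundaries, written as plain nested loops in C++. It is not Pochoir's specification language and not the trapezoidal or hyperspace-cut algorithm; it only illustrates the kind of simple loop implementation the abstract uses as a baseline. The function name `heat_1d_periodic` and the parameter `alpha` are assumptions introduced for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Pochoir): advances a 1D grid by `steps`
// time steps of a 3-point heat stencil with periodic (wrap-around) boundaries.
std::vector<double> heat_1d_periodic(std::vector<double> u, int steps, double alpha) {
    assert(steps >= 0);
    const std::size_t n = u.size();
    if (n == 0) return u;

    std::vector<double> next(n);  // second buffer: reads and writes never alias
    for (int t = 0; t < steps; ++t) {
        for (std::size_t i = 0; i < n; ++i) {
            // Each point is updated from itself and its two nearest neighbors.
            const double left = u[(i + n - 1) % n];
            const double right = u[(i + 1) % n];
            next[i] = u[i] + alpha * (left - 2.0 * u[i] + right);
        }
        std::swap(u, next);  // the freshly computed step becomes the current one
    }
    return u;
}
```

Every iteration of the inner loop is independent, so that loop could be parallelized directly. Doing so sweeps the whole grid once per time step, which is the memory-traffic pattern that cache-efficient decompositions are designed to avoid.
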
Year | DOI | Venue |
---|---|---|
2011 | 10.1145/1989493.1989508 | SPAA |
Keywords | Field | DocType |
pochoir compiler,domain-specific stencil language,pochoir system,parallel cache-efficient stencil,stencil benchmarks,stencil computation,pochoir stencil compiler,efficient parallel cache-oblivious algorithm,stencil specification,general d-dimensional stencil,c,compiler,cache oblivious algorithm,parallel computation,multicore | Cache-oblivious algorithm,Cache,Computer science,Parallel computing,Stencil,Stencil code,Compiler,Cilk,Multi-core processor,Debugging,Distributed computing | Conference |
Citations | PageRank | References |
80 | 2.71 | 21 |
Authors | ||
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Yuan Tang | 1 | 100 | 5.07 |
Rezaul Alam Chowdhury | 2 | 354 | 26.55 |
Bradley C. Kuszmaul | 3 | 1563 | 146.28 |
Chi-Keung Luk | 4 | 2537 | 116.49 |
Charles E. Leiserson | 5 | 5554 | 643.56 |