Title
Memory reuse optimizations in the R-Stream compiler.
Abstract
We propose a new set of automated techniques to optimize memory reuse in programs with explicitly managed memory. Our techniques are inspired by hand-tuned seismic kernels on GPUs. The solutions we develop reduce the cost of transferring data across multiple memories with different bandwidth, latency and addressability properties. They result in reduction of communication volumes from main memory and faster execution speeds, comparable to hand-tuned implementations, for out-of-place stencils. We discuss various steps of our source-to-source compiler infrastructure and focus on specific optimizations which comprise: flexible generation of different granularities of communications with respect to computations, reduction of redundant transfers, reuse of data across processing elements using a globally addressable local memory and reuse of data within the same processing elements using a local private memory. The models of memory we consider in our techniques support the GPU model with device, shared and register memories. The techniques we derive are generally applicable and their formulation within our compiler can be extended to other types of architectures.
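The reduction in main-memory traffic mentioned in the abstract can be illustrated with a small sketch. This is not R-Stream output or the authors' method: it is a hypothetical Python model of a 3-point out-of-place stencil, where the function names, the tile size and the load counter are assumptions made for illustration. It compares reading every operand from "main memory" against staging each tile plus its halo once into a local buffer, which stands in for a GPU shared memory.

```python
def stencil_naive(a):
    """3-point out-of-place stencil; every operand read counts as a main-memory load."""
    n = len(a)
    out = [0.0] * n          # boundary elements are left at 0.0
    loads = 0
    for i in range(1, n - 1):
        out[i] = (a[i - 1] + a[i] + a[i + 1]) / 3.0
        loads += 3           # three separate reads from main memory
    return out, loads


def stencil_tiled(a, tile=4):
    """Same stencil, but each tile (plus a one-element halo on each side)
    is copied once into a local buffer and reused from there."""
    n = len(a)
    out = [0.0] * n          # boundary elements are left at 0.0
    loads = 0
    for start in range(1, n - 1, tile):
        end = min(start + tile, n - 1)
        local = a[start - 1:end + 1]   # one bulk transfer: tile + halo
        loads += len(local)
        for i in range(start, end):
            j = i - start + 1          # index of a[i] inside the local buffer
            out[i] = (local[j - 1] + local[j] + local[j + 1]) / 3.0
    return out, loads
```

For an 18-element input and a tile size of 4, both functions produce the same output, while the naive version counts 48 main-memory loads and the tiled version counts 24. The remaining redundancy comes from halo elements shared by neighboring tiles.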
Year
2013
DOI
10.1145/2458523.2458528
Venue
GPGPU@ASPLOS
Keywords
hand-tuned seismic kernel, processing element, r-stream compiler, main memory, hand-tuned implementation, different granularity, memory reuse, memory reuse optimizations, different bandwidth, local private memory, multiple memory, addressable local memory, gpgpu, polyhedral model, parallelization
Field
Computer architecture, Uniform memory access, Shared memory, Computer science, Parallel computing, Memory ordering, Cache-only memory architecture, Memory management, Memory model, Flat memory model, Distributed shared memory
DocType
Conference
Citations
3
PageRank
0.40
References
15
Authors
4
Name | Order | Citations | PageRank
Nicolas Vasilache | 1 | 354 | 19.45
Muthu Baskaran | 2 | 118 | 5.88
Benoît Meister | 3 | 138 | 12.84
Richard Lethin | 4 | 118 | 17.17