Title
Optimization principles for collective neighborhood communications
Abstract
Many scientific applications operate in a bulk-synchronous mode of iterative communication and computation steps. Even though the communication steps happen at the same logical time, important patterns such as stencil computations cannot be expressed as collective communications in MPI. We demonstrate how neighborhood collective operations make it possible to specify arbitrary collective communication relations at run time and enable optimizations similar to those for traditional collective calls. We show a number of optimization opportunities and algorithms for different communication scenarios. We also show how users can assert constraints that provide additional optimization opportunities in a portable way. We demonstrate the utility of all described optimizations in a highly optimized implementation of neighborhood collective operations. Our communication and protocol optimizations result in a performance improvement of up to a factor of two for small stencil communications. We found that, for some patterns, our optimization heuristics automatically generate communication schedules that are comparable to hand-tuned collectives. With those optimizations in place, we are able to accelerate arbitrary collective communication patterns, such as regular and irregular stencils, with optimization methods for collective communications. We expect that our methods will influence the design of future MPI libraries and provide a significant performance benefit on large-scale systems.
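To make the idea of a neighborhood collective concrete, the following is a minimal sketch that simulates the exchange semantics in plain Python, without MPI. It is not the authors' implementation and shows none of their optimizations. The function names `stencil_neighbors` and `neighbor_alltoall` are hypothetical, and the sketch assumes a periodic 3 x 3 process grid with a symmetric 5-point stencil in which no rank appears twice in a neighbor list. In MPI-3, the corresponding standardized calls are `MPI_Dist_graph_create_adjacent` and `MPI_Neighbor_alltoall`.

```python
from typing import Dict, List


def stencil_neighbors(rank: int, rows: int, cols: int) -> List[int]:
    """Neighbors (north, south, west, east) of `rank` on a periodic rows x cols grid."""
    r, c = divmod(rank, cols)
    return [
        ((r - 1) % rows) * cols + c,  # north
        ((r + 1) % rows) * cols + c,  # south
        r * cols + (c - 1) % cols,    # west
        r * cols + (c + 1) % cols,    # east
    ]


def neighbor_alltoall(topology: Dict[int, List[int]],
                      sendbufs: Dict[int, List[object]]) -> Dict[int, List[object]]:
    """Simulate a neighborhood all-to-all.

    Each rank sends sendbufs[rank][i] to its i-th neighbor. The result
    recvbufs[rank][j] holds the item that the j-th neighbor addressed to `rank`.
    Assumption: neighbor lists are symmetric and contain no duplicates.
    """
    recvbufs: Dict[int, List[object]] = {}
    for rank, neighbors in topology.items():
        received = []
        for src in neighbors:
            slot = topology[src].index(rank)  # position of `rank` in src's list
            received.append(sendbufs[src][slot])
        recvbufs[rank] = received
    return recvbufs


ROWS, COLS = 3, 3
# The communication relation is built at run time, one neighbor list per rank.
topology = {rank: stencil_neighbors(rank, ROWS, COLS) for rank in range(ROWS * COLS)}
# Each rank prepares one distinct message, tagged (sender, receiver), per neighbor.
sendbufs = {rank: [(rank, dst) for dst in nbrs] for rank, nbrs in topology.items()}
recvbufs = neighbor_alltoall(topology, sendbufs)
```

Because every rank declares its whole neighborhood up front, a library that sees this relation could in principle schedule or combine the individual messages, which is the kind of opportunity the abstract refers to.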
Year
2012
DOI
10.1109/SC.2012.86
Venue
High Performance Computing, Networking, Storage and Analysis
Keywords
application program interfaces, iterative methods, message passing, parallel programming, protocols, software performance evaluation, MPI libraries, automatically communication schedules generate, bulk-synchronous mode, collective neighborhood communications, irregular stencils, iterative communication, iterative computation steps, large-scale systems, neighborhood collective operations, optimization heuristics, performance improvement, protocol optimization principles, stencil communications, stencil computations
Field
Iterative method, Computer science, Parallel computing, Stencil, Heuristics, Schedule, Multi-core processor, Message passing, Performance improvement, Computation, Distributed computing
DocType
Conference
ISSN
2167-4329
ISBN
978-1-4673-0805-2
Citations
17
PageRank
0.91
References
29
Authors
2
Name | Order | Citations | PageRank
Torsten Hoefler | 1 | 2197 | 163.64
Timo Schneider | 2 | 312 | 18.39