Title: Early experiences co-scheduling work and communication tasks for hybrid MPI+X applications
Abstract
Advances in node-level architecture and interconnect technology needed to reach extreme scale necessitate a reevaluation of long-standing models of computation, in particular bulk synchronous processing. The end of Dennard scaling and the subsequent increases in CPU core counts with each successive generation of general-purpose processor have made the ability to leverage parallelism for communication an increasingly critical aspect of future extreme-scale application performance. However, the use of massive multithreading in combination with MPI remains an open research area, with many proposed approaches requiring code changes that can be infeasible for important large legacy applications already written in MPI. This paper covers the design and initial evaluation of an extension of a massive-multithreading runtime system supporting dynamic parallelism that interfaces with MPI to handle fine-grain parallel communication and communication-computation overlap. Our initial evaluation of the approach uses the ubiquitous stencil computation, in three dimensions, with the halo exchange as the driving example, a pattern with a demonstrated tie to real code bases. The preliminary results suggest that even for a very well-studied and balanced workload and message exchange pattern, co-scheduling work and communication tasks is effective at significant levels of decomposition, using up to 131,072 cores. Furthermore, we demonstrate useful communication-computation overlap when handling blocking send and receive calls, and show evidence suggesting that we can decrease the burstiness of network traffic, with a corresponding decrease in the rate of stalls (congestion) seen on the host link and network.
Year: 2014
DOI: 10.1109/ExaMPI.2014.6
Venue: ExaMPI@SC
Keywords: computational modeling, instruction sets, programming, parallel processing
Field: Multithreading, Instruction set, Parallel communication, Computer science, Parallel computing, Stencil code, Burstiness, Model of computation, Multi-core processor, Runtime system, Distributed computing
DocType: Conference
Citations: 10
PageRank: 0.58
References: 33
Authors: 6

Name                  Order  Citations  PageRank
Dylan T. Stark        1      58         2.82
Richard F. Barrett    2      286        22.94
Ryan E. Grant         3      28         3.34
Stephen L. Olivier    4      29         2.77
Kevin T. Pedretti     5      196        21.20
Courtenay T. Vaughan  6      80         9.84