Title
Cartesian Collective Communication
Abstract
We introduce Cartesian Collective Communication as sparse, collective communication defined on processes (processors) organized into d-dimensional tori or meshes. Processes specify their local neighborhoods, e.g., stencil patterns, as lists of relative Cartesian coordinate offsets. The Cartesian collective operations perform data exchanges (and reductions) over the set of all neighborhoods such that each process communicates with the processes in its local neighborhood. The key requirement is that all local neighborhoods be structurally identical (isomorphic). This makes it possible for each process to compute correct, deadlock-free, efficient communication schedules for the collective operations locally, without any interaction with other processes. Cartesian Collective Communication substantially extends the collective neighborhood communication on Cartesian communicators defined by the MPI standard, and is a restricted form of neighborhood collective communication on general, distributed graph topologies. We show that the restriction to isomorphic neighborhoods permits communication improvements beyond what is possible for unrestricted graph topologies by presenting non-trivial message-combining algorithms that reduce communication latency for Cartesian alltoall and allgather collective operations. For both types of communication, the required communication schedules can be computed in time linear in the size of the input neighborhood. Our benchmarks show that, for small data block sizes, we can substantially outperform the general MPI neighborhood collectives implementing the same communication pattern. We discuss different possibilities for supporting Cartesian Collective Communication in MPI. Our library is implemented on top of MPI and uses the same signatures for the collective communication operations as the MPI (neighborhood) collectives. Our implementation requires essentially only a single new communicator creation function, and even this might not be needed when implemented inside an MPI library.
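To make the communication pattern concrete, the sketch below shows how a stencil-style neighborhood exchange can be expressed with standard MPI, which is the general-topology baseline the abstract says the Cartesian collectives outperform: relative coordinate offsets are translated to ranks on a periodic Cartesian communicator, a distributed graph topology with that neighborhood is created, and MPI_Neighbor_alltoall performs the exchange. This is not the paper's own library or API; the 2D grid, the four off-center offsets of a 5-point stencil, and the one-integer data blocks are illustrative assumptions.

/* Sketch: stencil neighborhood via standard MPI neighborhood collectives.
 * Grid shape, offsets and block size are illustrative assumptions. */
#include <mpi.h>

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);

    int size, rank;
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* 2D periodic torus; MPI chooses a balanced factorization of size. */
    int dims[2] = {0, 0}, periods[2] = {1, 1};
    MPI_Dims_create(size, 2, dims);
    MPI_Comm cart;
    MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 0, &cart);
    MPI_Comm_rank(cart, &rank);

    int coords[2];
    MPI_Cart_coords(cart, rank, 2, coords);

    /* Neighborhood given as relative coordinate offsets (the four
     * off-center points of a 5-point stencil); every process uses the
     * same list, so all neighborhoods are structurally identical. */
    const int off[4][2] = {{-1, 0}, {1, 0}, {0, -1}, {0, 1}};
    int nbr[4];
    for (int i = 0; i < 4; i++) {
        int c[2] = {coords[0] + off[i][0], coords[1] + off[i][1]};
        MPI_Cart_rank(cart, c, &nbr[i]); /* periodic dims wrap around */
    }

    /* General distributed graph topology with the same (symmetric)
     * neighborhood; duplicate neighbors on tiny grids are allowed. */
    MPI_Comm dist;
    MPI_Dist_graph_create_adjacent(cart, 4, nbr, MPI_UNWEIGHTED,
                                   4, nbr, MPI_UNWEIGHTED,
                                   MPI_INFO_NULL, 0, &dist);

    /* One data block per neighbor, exchanged over all neighborhoods;
     * send block i goes to nbr[i], recv block i comes from nbr[i]. */
    int sendbuf[4], recvbuf[4];
    for (int i = 0; i < 4; i++) sendbuf[i] = rank;
    MPI_Neighbor_alltoall(sendbuf, 1, MPI_INT, recvbuf, 1, MPI_INT, dist);

    MPI_Comm_free(&dist);
    MPI_Comm_free(&cart);
    MPI_Finalize();
    return 0;
}

Since the paper's library reuses the MPI neighborhood-collective signatures, the MPI_Neighbor_alltoall call above is also the shape of call a Cartesian alltoall would replace, with the neighbor-rank computation subsumed by the (single, new) communicator creation function.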
Year
2019
DOI
10.1145/3337821.3337848
DocType
Conference
ISBN
978-1-4503-6295-5
Citations
0
PageRank
0.34
References
0
Authors
2
Name                  Order  Citations  PageRank
Jesper Larsson Träff  1      845        94.16
Sascha Hunold         2      121        20.11