Title
Network-Friendly One-Sided Communication through Multinode Cooperation on Petascale Cray XT5 Systems
Abstract
One-sided communication is important for enabling asynchronous communication and data movement in Global Address Space (GAS) programming models. Such communication is typically realized through direct messages between initiator and target processes. On petascale systems with tens of thousands of nodes and hundreds of thousands of cores, these direct messages require dedicated communication buffers and/or channels, which can lead to significant scalability challenges for GAS programming models. In this paper, we describe a network-friendly communication model, multinode cooperation, that enables indirect one-sided communication. Compute nodes work together to handle one-sided requests through (1) request forwarding, in which one node can intercept a request and forward it to a target node, and (2) request aggregation, in which one node can aggregate many requests to a target node. We have implemented multinode cooperation for a popular GAS runtime library, the Aggregate Remote Memory Copy Interface (ARMCI). Our experimental results on a large-scale Cray XT5 system demonstrate that multinode cooperation is able to greatly increase memory scalability by reducing the communication buffers required on each node. In addition, multinode cooperation improves the resiliency of the GAS runtime system to network contention. Furthermore, multinode cooperation can benefit the performance of scientific applications. In one case, it reduces the total execution time of an NWChem application by 52%.
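The two mechanisms named in the abstract, request forwarding and request aggregation, can be illustrated with a minimal Python sketch. It assumes a hypothetical 2D grid of nodes in which each node keeps buffers only for peers in its own row and column; this layout, and every function name below, is an illustrative assumption and is not taken from the paper or from the ARMCI API.

```python
from collections import defaultdict


def direct_buffer_count(num_nodes):
    """Buffers per node when every node talks directly to every other node."""
    return num_nodes - 1


def grid_buffer_count(rows, cols):
    """Buffers per node when a node only talks to its row and column peers
    (hypothetical 2D grid layout, assumed for illustration)."""
    return (rows - 1) + (cols - 1)


def route(src, dst, cols):
    """Return the hops a request takes from src to dst on the assumed grid.

    The request first travels along src's row to dst's column, then the
    intermediate node forwards it along that column to dst.
    """
    if src == dst:
        return []
    src_row, _ = divmod(src, cols)
    _, dst_col = divmod(dst, cols)
    intermediate = src_row * cols + dst_col
    if intermediate in (src, dst):
        # src and dst already share a row or a column: one direct hop.
        return [dst]
    return [intermediate, dst]


def aggregate(requests):
    """Group (destination, payload) requests so that each destination
    receives one combined batch instead of many small messages."""
    batches = defaultdict(list)
    for dst, payload in requests:
        batches[dst].append(payload)
    return dict(batches)


if __name__ == "__main__":
    print(direct_buffer_count(16), grid_buffer_count(4, 4))
    print(route(0, 5, cols=4))
    print(aggregate([(5, "a"), (3, "b"), (5, "c")]))
```

In this toy layout, a 16-node system needs 15 buffers per node with direct messaging but only 6 per node on a 4x4 grid, at the cost of one extra hop for requests that cross both a row and a column.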
Year: 2011
DOI: 10.1109/CCGrid.2011.62
Venue: CCGrid
Keywords: network-friendly one-sided communication, gas, multinode cooperation, parallel programming, direct message, cooperative communication, gas runtime library, petascale cray xt5 systems, memory scalability, petascale cray xt5 system, gas programming model, buffer storage, network-friendly communication model, one-sided communication, network friendly one sided communication, indirect one-sided communication, telecommunication computing, aggregate remote memory copy interface, communication buffer, request aggregation, telecommunication network management, global address space programming model, message passing, armci, asynchronous communication, target node, programming, scalability, servers, bandwidth
Field: Asynchronous communication, Programming paradigm, Computer science, Models of communication, Runtime library, Cray XT5, Petascale computing, Distributed computing, Runtime system, Scalability
DocType: Conference
ISBN: 978-0-7695-4395-6
Citations: 0
PageRank: 0.34
References: 10
Authors: 5
Name | Order | Citations | PageRank
Xinyu Que | 1 | 124 | 11.81
Weikuan Yu | 2 | 1042 | 77.40
Vinod Tipparaju | 3 | 638 | 46.25
Vetter, Jeffrey | 4 | 2383 | 186.44
Bin Wang | 5 | 120 | 8.13