Abstract |
---|
Despite the vast interest in accelerator-based systems, programming large multinode GPU systems remains a complex task, particularly with respect to optimal data movement across the host-GPU PCIe connection and then across the network. To address these issues, GPU-integrated MPI solutions have been developed that incorporate GPU data movement into existing MPI implementations. Currently available GPU-integrated frameworks differ in the buffer synchronization and ordering semantics they provide to users. The noteworthy models are (1) the unified virtual addressing (UVA)-based approach and (2) the MPI attributes-based approach. In this paper, we compare these approaches with respect to both programmability and performance, and we demonstrate that the UVA-based design is useful for isolated communication with no data dependencies or ordering requirements, while the attributes-based design may be more appropriate when multiple interdependent MPI and GPU operations are interleaved. |
Year | DOI | Venue |
---|---|---|
2013 | 10.1109/IPDPSW.2013.256 | IPDPS Workshops |
Keywords | Field | DocType |
---|---|---|
gpu data movement, mpi attributes-based approach, uva-based design, ordering semantics, gpu operation, mpi implementation, data dependency, data movement, attributes-based design, hybrid mpi, gpu programming, gpu-integrated mpi solution, available gpu-integrated framework, message passing, data transfer, semantics, gpgpu, mpi, programming, synchronization, kernel | Computer architecture, Synchronization, Computer science, CUDA, Virtual address space, Parallel computing, Implementation, General-purpose computing on graphics processing units, PCI Express, Message passing, Semantics | Conference |
Citations | PageRank | References |
---|---|---|
2 | 0.40 | 10 |
Authors |
---|
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ashwin M. Aji | 1 | 143 | 11.26 |
Pavan Balaji | 2 | 1475 | 111.48 |
James Dinan | 3 | 285 | 21.84 |
Wu-chun Feng | 4 | 2812 | 232.50 |
Rajeev Thakur | 5 | 3773 | 251.09 |