Two-stage Asynchronous Iterative Solvers for multi-GPU Clusters - Citegraph

Paper Info

Title
Two-stage Asynchronous Iterative Solvers for multi-GPU Clusters

Abstract
Given the trend of supercomputers accumulating much of their compute power in GPU accelerators composed of thousands of cores and operating in streaming mode, global synchronization points become a bottleneck, severely confining the performance of applications. In consequence, asynchronous methods breaking up the bulk-synchronous programming model are becoming increasingly attractive. In this paper, we study a GPU-focused asynchronous version of the Restricted Additive Schwarz (RAS) method that employs preconditioned Krylov subspace methods as subdomain solvers. We analyze the method for various parameters such as local solver tolerance and iteration counts. Leveraging the multi-GPU architecture on Summit, we show that these two-stage methods are more memory- and time-efficient than asynchronous RAS using direct solvers. We also demonstrate the superiority over synchronous counterparts, and present results using one-sided CUDA-aware MPI on up to 36 NVIDIA V100 GPUs.
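
The two-stage idea in the abstract (an outer Restricted Additive Schwarz iteration whose subdomain problems are solved only approximately by a Krylov method) can be illustrated with a minimal sketch. This is not the paper's implementation: it is synchronous, sequential, and runs on the CPU in plain Python. The 1D Poisson model problem, the two-subdomain layout, the unpreconditioned CG inner solver, and every function and parameter name (poisson_matvec, cg, two_stage_ras, inner_tol, inner_iters) are assumptions made for illustration only.

```python
import math

def poisson_matvec(v):
    """Apply the 1D Poisson stencil [-1, 2, -1] with zero Dirichlet ends."""
    n = len(v)
    return [2.0 * v[i]
            - (v[i - 1] if i > 0 else 0.0)
            - (v[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cg(rhs, tol, max_iters):
    """Inner stage: unpreconditioned CG on a local Poisson block, zero start.
    Stops after max_iters steps or once ||r|| <= tol * ||rhs||."""
    x = [0.0] * len(rhs)
    r = list(rhs)
    p = list(rhs)
    rr = dot(r, r)
    stop = tol * tol * rr
    for _ in range(max_iters):
        if rr <= stop:
            break
        ap = poisson_matvec(p)
        alpha = rr / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x

def residual(b, x):
    return [bi - ai for bi, ai in zip(b, poisson_matvec(x))]

def two_stage_ras(b, subdomains, inner_tol, inner_iters, outer_iters):
    """Outer stage: synchronous RAS sweeps.
    Each subdomain is (ext_lo, ext_hi, own_lo, own_hi): it solves on the
    overlapping range [ext_lo, ext_hi) but writes back only the
    non-overlapping range [own_lo, own_hi) (the 'restricted' part)."""
    x = [0.0] * len(b)
    history = []
    for _ in range(outer_iters):
        r = residual(b, x)
        history.append(math.sqrt(dot(r, r)))
        update = [0.0] * len(b)
        for ext_lo, ext_hi, own_lo, own_hi in subdomains:
            # A contiguous slice of the global stencil is again a Poisson
            # block, so the local matrix never has to be formed explicitly.
            z = cg(r[ext_lo:ext_hi], inner_tol, inner_iters)
            for j in range(own_lo, own_hi):
                update[j] = z[j - ext_lo]
        x = [xi + ui for xi, ui in zip(x, update)]
    r = residual(b, x)
    history.append(math.sqrt(dot(r, r)))  # residual after the last sweep
    return x, history

# Hypothetical setup: 32 unknowns, two subdomains overlapping on 12..19.
n = 32
b = [1.0] * n
subdomains = [(0, 20, 0, 16), (12, 32, 16, 32)]
x, history = two_stage_ras(b, subdomains, inner_tol=1e-6,
                           inner_iters=50, outer_iters=40)
print("relative residual:", history[-1] / history[0])
```

In this sketch, inner_tol and inner_iters play the role of the local solver tolerance and iteration count that the abstract says are varied. An asynchronous variant would let each subdomain proceed with whatever neighbor data has arrived instead of waiting at the end of every sweep; that part is not modeled here.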

Year	2020
DOI	10.1109/ScalA51936.2020.00007
Venue	2020 IEEE/ACM 11th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA)
Keywords	Asynchronous iterative methods, Schwarz methods, GPUs, Krylov subspace solvers
DocType	Conference
ISBN	978-1-6654-2271-0
Citations	0
PageRank	0.34
References	8
Authors	3

Authors (3 rows)

Name	Order	Citations	PageRank
Pratik Nayak	1	0	0.34
Terry Cojean	2	9	4.27
Hartwig Anzt	3	0	0.34
