Title
Exploration of GPU sharing policies under GEMM workloads
Abstract
Lately, cloud computing has seen explosive growth due to the flexibility and scalability it offers. The ever-increasing computational demands, especially from the machine learning domain, have forced cloud operators to enhance their infrastructure with acceleration devices, such as general-purpose GPUs (GPGPUs) or FPGAs. Even though multi-tenancy has been widely examined for conventional CPUs, this is not the case for accelerators. Current solutions support "one accelerator per user" schemes, which can lead to both under-utilization and starvation of available resources. In this work, we analyze the potential of GPU sharing inside data-center environments. We investigate how several architectural features affect the performance of GPUs under different multi-tenant stressing scenarios. We compare CUDA MPS with the native, default CUDA scheduler, as well as with Vinetalk, a research framework providing GPU sharing capabilities. Experimental results show that NVIDIA's MPS achieves the best performance in multi-application scenarios, up to 4.5× and 11.2× better than the native CUDA scheduler and Vinetalk, respectively.
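As context for the comparison above: unlike the default CUDA scheduler, NVIDIA's MPS (Multi-Process Service) is enabled system-side rather than per application, by starting a control daemon that lets kernels from multiple processes share a single GPU context. A minimal sketch follows; the device index is an illustrative assumption, and the GEMM clients themselves are elided:

```shell
# Start the NVIDIA MPS control daemon so that kernels launched from
# multiple processes can share the GPU through a common MPS server.
export CUDA_VISIBLE_DEVICES=0          # target GPU (illustrative choice)
nvidia-cuda-mps-control -d             # launch the control daemon

# ... run the concurrent GEMM workloads here; CUDA applications
# transparently attach to the MPS server instead of creating
# private per-process contexts ...

# Shut MPS down once the experiment finishes.
echo quit | nvidia-cuda-mps-control
```

With the daemon running, no application-side changes are needed; the same binaries used under the default scheduler attach to MPS automatically.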
Year: 2020
DOI: 10.1145/3378678.3391887
Venue: SCOPES '20: 23rd International Workshop on Software and Compilers for Embedded Systems, St. Goar, Germany, May 2020
Keywords: GPGPU sharing, native CUDA queues, CUDA MPS, Vinetalk, interference analysis, cloud computing
DocType: Conference
ISBN: 978-1-4503-7131-5
Citations: 1
PageRank: 0.37
References: 0
Authors: 5
Order | Name                    | Citations | PageRank
1     | Ioannis Oroutzoglou     | 3         | 0.75
2     | Dimosthenis Masouros    | 12        | 6.37
3     | Konstantina Koliogeorgi | 1         | 2.06
4     | Sotirios Xydis          | 144       | 31.51
5     | Dimitrios Soudris       | 243       | 48.41