POSTER: Accelerate GPU Concurrent Kernel Execution by Mitigating Memory Pipeline Stalls - Citegraph

Paper Info

Title
POSTER: Accelerate GPU Concurrent Kernel Execution by Mitigating Memory Pipeline Stalls

Abstract
In this study, we demonstrate that interference among concurrent kernels can undermine performance in state-of-the-art intra-SM sharing schemes for concurrent kernel execution (CKE) on GPUs. We highlight that cache partitioning techniques proposed for CPUs are not effective for GPUs. We then propose balancing memory accesses and limiting the number of in-flight memory instructions issued from concurrent kernels to reduce memory pipeline stalls. Our proposed schemes significantly improve the performance of two state-of-the-art intra-SM sharing schemes, Warped-Slicer and SMK.

Year	2017
DOI	10.1109/PACT.2017.30
Venue	2017 26th International Conference on Parallel Architectures and Compilation Techniques (PACT)
Keywords	GPUs, concurrent kernel execution, memory pipeline, memory subsystem
Field	Kernel (linear algebra), Resource management, Pipeline transport, Computer science, Cache, Parallel computing, Real-time computing, Interference (wave propagation), Throughput, Benchmark (computing)
DocType	Conference
ISSN	1089-795X
ISBN	978-1-5090-6765-7
Citations	2
PageRank	0.36
References	4
Authors	7

Authors (7 rows)

Name	Order	Citations	PageRank
Hongwen Dai	1	28	3.14
Zhen Lin	2	35	4.21
Chao Li	3	132	6.04
Chen Zhao	4	15	10.09
Fei Wang	5	203	40.33
Nanning Zheng	6	3975	329.18
Huiyang Zhou	7	994	63.26
