Title
Taming irregular applications via advanced dynamic parallelism on GPUs
Abstract
On recent GPU architectures, dynamic parallelism, which enables the launching of kernels from the GPU without CPU involvement, provides a way to improve the performance of irregular applications by generating child kernels dynamically to reduce workload imbalance and improve GPU utilization. However, in practice, dynamic parallelism does not improve performance due to high kernel launch overhead and low child kernel occupancy. Consequently, most existing studies focus on mitigating the kernel launch overhead. As the kernel launch overhead has decreased due to algorithmic redesigns and hardware architectural innovations, the organization of subtasks to child kernels becomes a new performance bottleneck. We present an in-depth characterization of existing software approaches for dynamic parallelism optimizations on the latest GPUs. We observe that current approaches of subtask aggregation, which use the "one-size-fits-all" method by treating all subtasks equally, can under-utilize resources and degrade overall performance, as different subtasks require various configurations for optimal performance. To address this problem, we leverage statistical and machine-learning techniques and propose a performance modeling and task scheduling tool that can (1) analyze the performance characteristics of subtasks to identify the critical performance factors, (2) predict the performance of new subtasks, and (3) generate the optimal aggregation strategy for new subtasks. Experimental results show that our approach with the optimal subtask aggregation strategy can achieve up to a 1.8-fold speedup over the existing task aggregation approach for dynamic parallelism.
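The abstract's point that a "one-size-fits-all" aggregation can under-utilize resources can be illustrated with a toy host-side model. The sketch below is not the authors' tool or performance model: the threshold rule, the candidate block sizes, and the "launched threads versus useful threads" metric are all assumptions chosen only to make the effect visible.

```python
# Toy model (assumption): each work item has a size; items above a threshold
# become subtasks that a parent kernel would hand to child kernels.

def split_work(sizes, threshold):
    """Separate items handled inline from items offloaded as subtasks."""
    inline = [s for s in sizes if s <= threshold]
    subtasks = [s for s in sizes if s > threshold]
    return inline, subtasks

def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)

def aggregate_uniform(subtasks, block_size):
    """One-size-fits-all: every subtask uses the same block size.

    Returns (threads launched, threads with useful work)."""
    launched = sum(ceil_div(s, block_size) * block_size for s in subtasks)
    return launched, sum(subtasks)

def aggregate_size_aware(subtasks, block_sizes):
    """Size-aware: each subtask takes the smallest candidate block size that
    covers it, or the largest candidate if none does (hypothetical rule)."""
    candidates = sorted(block_sizes)
    launched = 0
    for s in subtasks:
        fitting = [b for b in candidates if b >= s]
        block = fitting[0] if fitting else candidates[-1]
        launched += ceil_div(s, block) * block
    return launched, sum(subtasks)

def utilization(launched, useful):
    """Fraction of launched threads that have work to do."""
    return useful / launched if launched else 1.0

if __name__ == "__main__":
    sizes = [3, 40, 70, 300, 1000]  # made-up irregular workload
    _, subtasks = split_work(sizes, threshold=32)
    print(utilization(*aggregate_uniform(subtasks, 256)))
    print(utilization(*aggregate_size_aware(subtasks, [64, 128, 256])))
```

In this toy setting the size-aware variant launches fewer idle threads for the small subtasks. On a real GPU the best configuration also depends on factors this sketch ignores, such as launch overhead and occupancy, which is why the paper proposes learning the strategy from measured performance.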
Year
2018
DOI
10.1145/3203217.3203243
Venue
CF
Keywords
Dynamic parallelism, irregular applications, performance modeling, GPU
Field
Kernel (linear algebra), Bottleneck, Computer science, Workload, Scheduling (computing), Parallel computing, Software, Speedup
DocType
Conference
Citations
2
PageRank
0.38
References
22
Authors
5
Name | Order | Citations | PageRank
Jing Zhang | 1 | 70 | 6.53
Ashwin M. Aji | 2 | 143 | 11.26
Michael L. Chu | 3 | 2 | 0.72
Hao Wang | 4 | 402 | 24.64
Wu-chun Feng | 5 | 2812 | 232.50