Title
Concurrent warp execution: improving performance of GPU-likely SIMD architecture by increasing resource utilization
Abstract
Hardware parallelism should be exploited to improve the performance of computing systems. The single instruction, multiple data (SIMD) architecture has been widely used to maximize the throughput of computing systems by exploiting hardware parallelism. Unfortunately, branch divergence caused by branch instructions leaves computational resources underutilized, which degrades the performance of SIMD architectures. The graphics processing unit (GPU) is a representative parallel architecture based on SIMD. In recent computing systems, GPUs can process general-purpose applications as well as graphics applications with the help of convenient APIs. However, unlike graphics applications, general-purpose applications include many branch instructions, resulting in serious GPU performance degradation due to branch divergence. In this paper, we propose the concurrent warp execution (CWE) technique, which reduces the performance degradation of GPUs executing general-purpose applications by increasing resource utilization. The proposed CWE selects co-warps to activate more threads in a warp, leading to concurrent execution of the combined warps. According to our simulation results, the proposed architecture provides a significant performance improvement (5.85% over PDOM, 91% over DWF) with little hardware overhead.
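The utilization argument in the abstract can be illustrated with a minimal sketch. It is not the paper's mechanism: the abstract does not say how co-warps are selected, so the lane-disjointness test below (`can_combine`), the 32-lane warp width, and all names are assumptions made only for illustration.

```python
WARP_SIZE = 32  # assumed lane count; the abstract does not state one


def utilization(mask: int, width: int = WARP_SIZE) -> float:
    """Fraction of SIMD lanes doing useful work for one issued instruction."""
    return bin(mask).count("1") / width


def can_combine(mask_a: int, mask_b: int) -> bool:
    """Hypothetical co-warp test: the two active masks occupy disjoint lanes."""
    return mask_a & mask_b == 0


def average_utilization(masks) -> float:
    """Mean lane utilization over a sequence of issued instructions."""
    return sum(utilization(m) for m in masks) / len(masks)


# After a divergent branch, warp A has only its low 16 lanes active
# and warp B has only its high 16 lanes active (illustrative values).
warp_a = 0x0000FFFF
warp_b = 0xFFFF0000

# Issued separately: two issue slots, each half full.
separate = average_utilization([warp_a, warp_b])

# Issued together when the lanes do not collide: one full issue slot.
if can_combine(warp_a, warp_b):
    combined = average_utilization([warp_a | warp_b])
else:
    combined = separate

print(separate, combined)
```

In this toy case the two half-empty warps fill every lane once merged, which is the kind of utilization gain the abstract attributes to executing combined warps concurrently.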
Year: 2014
DOI: 10.1007/s11227-014-1155-4
Venue: The Journal of Supercomputing
Keywords: SIMD, GPU, resource utilization, branch divergence, parallel architecture
Field: Graphics, Architecture, Computer architecture, Computer science, Parallel computing, SIMD, Thread (computing), SIMD architecture, Throughput, Graphics processing unit, Performance improvement
DocType: Journal
Volume: 69
Issue: 1
ISSN: 1573-0484
Citations: 0
PageRank: 0.34
References: 15
Authors: 4
Name             Order  Citations  PageRank
Hong Jun Choi    1      30         5.74
Dong Oh Son      2      21         4.19
Jong-Myon Kim    3      91         25.99
Cheol Hong Kim   4      73         24.39