Abstract | ||
---|---|---|
Due to their massive computational power, graphics processing units (GPUs) have become a popular platform for executing general purpose parallel applications. GPU programming models allow the programmer to create thousands of threads, each executing the same computing kernel. GPUs exploit this parallelism in two ways. First, threads are grouped into fixed-size SIMD batches known as warps, and second, many such warps are concurrently executed on a single GPU core. Despite these techniques, the computational resources on GPU cores are still underutilized, resulting in performance far short of what could be delivered. Two reasons for this are conditional branch instructions and stalls due to long latency operations. To improve GPU performance, computational resources must be more effectively utilized. To accomplish this, we propose two independent ideas: the large warp microarchitecture and two-level warp scheduling. We show that when combined, our mechanisms improve performance by 19.1% over traditional GPU cores for a wide variety of general purpose parallel applications that heretofore have not been able to fully exploit the available resources of the GPU chip. |

Year | DOI | Venue |
---|---|---|
2011 | 10.1145/2155620.2155656 | MICRO |

Keywords | Field | DocType |
---|---|---|
gpu programming model, gpu chip, general purpose parallel application, computational resource, traditional gpu core, two-level warp scheduling, large warp microarchitecture, single gpu core, gpu performance, gpu core, massive computational power, improving gpu performance, gpgpu, gpu programming, chip, simd, divergence | Graphics, Programmer, CUDA, Computer science, Scheduling (computing), Parallel computing, SIMD, Thread (computing), General-purpose computing on graphics processing units, Microarchitecture | Conference |

ISSN | ISBN | Citations |
---|---|---|
1072-4451 | 978-1-5090-6605-6 | 192 |

PageRank | References | Authors |
---|---|---|
5.59 | 19 | 6 |

Name | Order | Citations | PageRank |
---|---|---|---|
Veynu Narasiman | 1 | 241 | 7.34 |
Michael Shebanow | 2 | 351 | 109.92 |
Chang Joo Lee | 3 | 506 | 15.30 |
Rustam Miftakhutdinov | 4 | 308 | 9.01 |
Onur Mutlu | 5 | 9446 | 357.40 |
Yale N. Patt | 6 | 4947 | 566.20 |