Abstract | ||
---|---|---|
GPUs employ massive multithreading and fast context switching to provide high throughput and hide memory latency. Multithreading can, however, increase contention for various system resources, which may result in suboptimal utilization of shared resources. Previous research has proposed variants of throttling thread-level parallelism to reduce cache contention and improve performance. Throttling approaches can, however, lead to under-utilizing thread contexts, on-chip interconnect, and off-chip memory bandwidth. This paper proposes to tightly couple the thread scheduling mechanism with the cache management algorithms such that GPU cache pollution is minimized while off-chip memory throughput is enhanced. We propose priority-based cache allocation (PCAL), which provides preferential cache capacity to a subset of high-priority threads while simultaneously allowing lower-priority threads to execute without contending for the cache. By tuning thread-level parallelism while optimizing both caching efficiency and other shared resource usage, PCAL builds upon previous thread throttling approaches, improving overall performance by an average of 17% and a maximum of 51%. |
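
The idea in the abstract, that only a high-priority subset of threads may allocate cache lines while the remaining threads keep running without polluting the cache, can be illustrated with a toy model. The sketch below is not the paper's implementation: the class `TokenGatedCache`, the helper `run`, the fully associative LRU policy, the round-robin access pattern, and all sizes are assumptions chosen only to make the effect visible.

```python
from collections import OrderedDict


class TokenGatedCache:
    """Toy fully associative LRU cache (hypothetical model, not the paper's design).

    Any warp may hit on a resident line, but only warps listed in
    `token_holders` may allocate a line on a miss; other warps bypass.
    """

    def __init__(self, num_lines, token_holders):
        self.num_lines = num_lines
        self.token_holders = set(token_holders)
        self.lines = OrderedDict()  # address -> warp that allocated it
        self.hits = 0
        self.misses = 0

    def access(self, warp_id, addr):
        if addr in self.lines:
            self.lines.move_to_end(addr)  # refresh LRU position
            self.hits += 1
            return True
        self.misses += 1
        if warp_id in self.token_holders:
            if len(self.lines) >= self.num_lines:
                self.lines.popitem(last=False)  # evict least recently used
            self.lines[addr] = warp_id
        # Warps without a token fall through: the miss is served from
        # memory and nothing is inserted into the cache.
        return False


def run(token_holders, num_warps=4, lines_per_warp=2, cache_lines=4, rounds=10):
    """Round-robin warps, each repeatedly touching its own private lines."""
    cache = TokenGatedCache(cache_lines, token_holders)
    for _ in range(rounds):
        for warp in range(num_warps):
            for i in range(lines_per_warp):
                cache.access(warp, warp * lines_per_warp + i)
    return cache.hits, cache.misses


if __name__ == "__main__":
    print("all warps allocate:", run({0, 1, 2, 3}))
    print("two warps hold tokens:", run({0, 1}))
```

In this toy setup the combined working set (8 lines) is twice the cache size (4 lines), so letting every warp allocate causes LRU thrashing and no hits, whereas restricting allocation to two warps lets their lines stay resident while the other two warps still issue their accesses to memory.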
Year | DOI | Venue |
---|---|---|
2015 | 10.1109/HPCA.2015.7056024 | HPCA |
Keywords | DocType | ISSN |
on-chip interconnect,cache storage,context switching,gpu cache pollution,parallel architectures,off-chip memory bandwidth,graphics processing units,multi-threading,thread scheduling mechanism,multithreading,throttling thread-level parallelism,cache management algorithm,priority-based cache allocation,throughput processor | Conference | 1530-0897 |
Citations | PageRank | References |
35 | 0.80 | 13 |
Authors | ||
8 |
Name | Order | Citations | PageRank |
---|---|---|---|
Dong Li | 1 | 475 | 67.20 |
Minsoo Rhu | 2 | 322 | 16.80 |
Daniel R. Johnson | 3 | 304 | 14.54 |
Mike O'Connor | 4 | 155 | 5.03 |
Mattan Erez | 5 | 1543 | 88.21 |
Doug Burger | 6 | 6160 | 491.08 |
Donald S. Fussell | 7 | 35 | 1.14 |
Stephen W. Keckler | 8 | 35 | 0.80 |