Abstract |
---|
GPUs employ thousands of threads per core to achieve high throughput. These threads exhibit localities in control-flow, instruction and data addresses and values. In this study we investigate inter-warp instruction temporal locality and show that during short intervals a significant share of fetched instructions are fetched unnecessarily. This observation provides several opportunities to enhance GPUs. We discuss different possibilities and evaluate filter cache as a case study. Moreover, we investigate how variations in microarchitectural parameters impact potential filter cache benefits in GPUs. |
Year | DOI | Venue |
---|---|---|
2013 | 10.1007/978-3-642-36424-2_12 | ARCS |
Keywords | Field | DocType |
high throughput,filter cache,temporal locality,inter-warp instruction,deep-multithreaded gpus,different possibility,threads exhibit locality,case study,short interval,impacts potential filter cache,data address,microarchitectural parameter | Computer architecture,Locality of reference,Computer science,Parallel computing,Filter cache,Thread (computing),Throughput | Conference |
Volume | ISSN | Citations |
7767 | 0302-9743 | 3 |
PageRank | References | Authors |
0.37 | 12 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ahmad Lashgar | 1 | 16 | 5.45 |
Amirali Baniasadi | 2 | 221 | 33.12 |
Ahmad Khonsari | 3 | 210 | 42.43 |