Abstract | ||
---|---|---|
Memory access is a primary performance bottleneck for any processing unit, and it plays a significant role in GPU performance as well. Beyond the inherently challenging parts of the GPU’s memory access path, low locality among requests considerably increases the memory access delay. Despite their immense processing power, GPUs cannot reach their maximum throughput because of memory access bottlenecks. Memory divergence and miss-locality among requests that miss in the L1 cache significantly increase Last-Level-Cache contention and main memory row-switching overheads. In addition, the interconnection network routes request packets without regard to their locality properties, and such a routing algorithm considerably disrupts the locality among the requests. In this paper, we propose Locality-Aware Resource Allocation (LARA), which reduces Streaming Multiprocessor stall time by arbitrating among memory request packets in favor of preserving locality at the GPU’s interconnection network. In addition, before memory requests are injected into the interconnection network, they are reordered in the injection-port buffer according to whether they belong to the same thread block. Memory divergence and miss-locality among the requests are the two main factors that increase the rates of Last-Level-Cache contention and main memory row switching. We propose a comprehensive approach that improves GPU performance by decreasing the average memory access delay, focusing on request locality to reduce Last-Level-Cache contention overheads and the main memory row-switching rate. As a result, we report a maximum speed-up of 33% and an average speed-up of 17% across the evaluated benchmarks, with no significant effect on system area or power consumption. |
Year | DOI | Venue |
---|---|---|
2021 | 10.1007/s11227-021-03854-w | The Journal of Supercomputing |
Keywords | DocType | Volume |
---|---|---|
Cache contention, Memory divergence, Graphics Processing Unit (GPU), GPU-NoC, Interconnection network, Locality, Memory, Priority, Row access, Stall time | Journal | 77 |
Issue | ISSN | Citations |
---|---|---|
12 | 0920-8542 | 0 |
PageRank | References | Authors |
---|---|---|
0.34 | 9 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Hossein BiTalebi | 1 | 0 | 0.34 |
Farshad Safaei | 2 | 95 | 19.37 |
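
The abstract mentions that memory requests are reordered in the injection-port buffer according to their thread block before they enter the interconnection network. The following is only a minimal sketch of that idea, not the authors' implementation: the `MemRequest` type, its fields, and the `reorder_by_thread_block` function are hypothetical names, and the stable "group by first arrival" policy is an assumption made for illustration.

```python
from collections import OrderedDict
from dataclasses import dataclass


@dataclass(frozen=True)
class MemRequest:
    # Hypothetical request record; the field names are assumptions.
    thread_block: int  # ID of the thread block that issued the request
    address: int       # target memory address


def reorder_by_thread_block(buffer):
    """Group buffered requests so that requests from the same thread
    block are adjacent. Groups keep the order in which their first
    request arrived, and requests keep their order within a group."""
    groups = OrderedDict()
    for req in buffer:
        groups.setdefault(req.thread_block, []).append(req)
    return [req for group in groups.values() for req in group]


# Interleaved requests from two thread blocks.
pending = [
    MemRequest(thread_block=0, address=0x100),
    MemRequest(thread_block=1, address=0x900),
    MemRequest(thread_block=0, address=0x140),
    MemRequest(thread_block=1, address=0x940),
]
reordered = reorder_by_thread_block(pending)
print([r.thread_block for r in reordered])
```

Keeping the grouping stable is one possible design choice: it avoids reordering requests within a thread block while still placing requests that are likely to share locality next to each other.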