Title
LARA: Locality-aware resource allocation to improve GPU memory-access time
Abstract
Memory access is a primary performance bottleneck of any processing unit, and it plays a significant role in GPU performance as well. Beyond the inherently challenging parts of the GPU's memory access path, low locality among requests considerably increases memory access delay. Despite their immense processing power, GPUs cannot reach their maximum throughput because of memory access bottlenecks. Memory divergence and poor locality among requests that miss in the L1 cache significantly increase Last-Level-Cache contention and main memory row-switching overheads. In addition, the interconnection network routes request packets without regard to their locality properties, and such routing considerably disrupts the locality among requests. In this paper, we propose Locality-Aware Resource Allocation (LARA), which reduces Streaming Multiprocessor stall time by arbitrating among memory request packets in favor of preserving locality in the GPU's interconnection network. In addition, before memory requests are injected into the interconnection network, they are reordered in the injection-port buffer according to whether they belong to the same thread block. Together, these techniques form a comprehensive approach to improving GPU performance by decreasing the average memory access delay, focusing on request locality to reduce Last-Level-Cache contention overheads and the main memory row-switching rate. We report a maximum speed-up of 33% and an average speed-up of 17% across the benchmarks used, without significant effect on system area or power consumption.
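The injection-port reordering described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the `MemRequest` type, its `block_id` and `address` fields, and the choice of a stable grouping (blocks kept in first-arrival order, requests within a block kept in arrival order) are all assumptions made for illustration, since the abstract only states that requests are reordered by thread block.

```python
from typing import Dict, List, NamedTuple


class MemRequest(NamedTuple):
    """Hypothetical memory request; field names are illustrative assumptions."""
    block_id: int  # ID of the thread block that issued the request
    address: int   # target memory address


def reorder_by_thread_block(buffer: List[MemRequest]) -> List[MemRequest]:
    """Group buffered requests so that requests from the same thread block
    are adjacent. Blocks appear in order of first arrival, and requests
    within a block keep their arrival order (an assumed policy)."""
    groups: Dict[int, List[MemRequest]] = {}
    for req in buffer:
        # dict preserves insertion order, so blocks stay in first-arrival order
        groups.setdefault(req.block_id, []).append(req)
    return [req for group in groups.values() for req in group]


# Example: interleaved requests from two thread blocks
pending = [
    MemRequest(0, 0x100),
    MemRequest(1, 0x900),
    MemRequest(0, 0x104),
    MemRequest(1, 0x904),
]
reordered = reorder_by_thread_block(pending)
```

In this example the two requests from block 0 end up adjacent, followed by the two from block 1, so packets that are likely to touch nearby addresses enter the network back to back. A hardware buffer would do this with bounded storage and per-cycle constraints that the sketch ignores.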
Year
2021
DOI
10.1007/s11227-021-03854-w
Venue
The Journal of Supercomputing
Keywords
Cache contention, Memory divergence, Graphics Processing Unit (GPU), GPU-NoC, Interconnection network, Locality, Memory, Priority, Row access, Stall time
DocType
Journal
Volume
77
Issue
12
ISSN
0920-8542
Citations
0
PageRank
0.34
References
9
Authors
2
Name              Order  Citations  PageRank
Hossein BiTalebi  1      0          0.34
Farshad Safaei    2      95         19.37