Title
A Framework for Memory Oversubscription Management in Graphics Processing Units
Abstract
Modern discrete GPUs support unified memory and demand paging. Automatic management of data movement between CPU memory and GPU memory dramatically reduces developer effort. However, when application working sets exceed physical memory capacity, the resulting data movement can cause great performance loss. This paper proposes a memory management framework, called ETC, that transparently improves GPU performance under memory oversubscription using new techniques to overlap eviction latency of GPU pages, reduce thrashing cost, and increase effective memory capacity. Eviction latency can be hidden by eagerly creating space for demand-paged data with proactive eviction (E). Thrashing costs can be ameliorated with memory-aware throttling (T), which dynamically reduces the GPU parallelism when page fault frequencies become high. Capacity compression (C) can enable larger working sets without increasing physical memory capacity. No single technique fits all workloads, and, thus, ETC integrates proactive eviction, memory-aware throttling and capacity compression into a principled framework that dynamically selects the most effective combination of techniques, transparently to the running software. To this end, ETC categorizes applications into three categories: regular applications without data sharing across kernels, regular applications with data sharing across kernels, and irregular applications. Our evaluation shows that ETC fully mitigates the oversubscription overhead for regular applications without data sharing and delivers performance similar to the ideal unlimited GPU memory baseline. We also show that ETC outperforms the state-of-the-art baseline by 60.4% and 270% for regular applications with data sharing and irregular applications, respectively.
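As an illustration of the memory-aware throttling idea described in the abstract (reducing GPU parallelism when page fault frequency becomes high), the following is a minimal sketch of a fault-rate-driven controller. It is not the paper's mechanism: the class name `ThrottleController`, the thresholds, the halving step, and the gradual release when faults subside are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ThrottleController:
    """Toy fault-rate-driven throttle. Every parameter here is hypothetical."""

    max_units: int = 16         # assumed number of compute units that can be active
    min_units: int = 1          # never throttle below this many units
    high_faults: float = 100.0  # assumed faults per interval that trigger throttling
    low_faults: float = 20.0    # assumed faults per interval below which we release
    active_units: int = field(init=False)

    def __post_init__(self) -> None:
        # Start with full parallelism.
        self.active_units = self.max_units

    def update(self, faults_in_interval: float) -> int:
        """Adjust the number of active units from the last interval's fault count."""
        if faults_in_interval > self.high_faults:
            # High fault frequency: halve parallelism (assumed policy).
            self.active_units = max(self.min_units, self.active_units // 2)
        elif faults_in_interval < self.low_faults:
            # Faults have subsided: restore one unit at a time (assumed policy).
            self.active_units = min(self.max_units, self.active_units + 1)
        # Between the two thresholds, leave parallelism unchanged.
        return self.active_units


if __name__ == "__main__":
    controller = ThrottleController()
    for faults in (500, 500, 50, 5):
        print(faults, controller.update(faults))
```

The two separate thresholds are a design choice in this sketch: the gap between them keeps the controller from oscillating when the fault rate hovers near a single cutoff.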
Year
2019
DOI
10.1145/3297858.3304044
Venue
Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems
Keywords
gpgpu applications, graphics processing units, oversubscription, virtual memory management
Field
Graphics, Latency (engineering), Computer science, Parallel computing, Data sharing, Thrashing, Software, Memory management, Page fault, Demand paging, Distributed computing
DocType
Conference
ISBN
978-1-4503-6240-5
Citations
6
PageRank
0.41
References
0
Authors
7
Name
Order
Citations
PageRank
Chen Li1112.58
Rachata Ausavarungnirun278029.88
Christopher J. Rossbach347228.33
Youtao Zhang41977122.84
Onur Mutlu59446357.40
Yang Guo66732.72
Jun Yang72762241.66