Title
OC-DNN - Exploiting Advanced Unified Memory Capabilities in CUDA 9 and Volta GPUs for Out-of-Core DNN Training.
Abstract
Existing frameworks cannot train large DNNs that do not fit in GPU memory without explicit memory management schemes. In this paper, we propose OC-DNN, a novel Out-of-Core DNN training framework that exploits new Unified Memory (UM) features along with new hardware mechanisms in Pascal and Volta GPUs. OC-DNN has two major design components: 1) OC-Caffe, an enhanced version of Caffe that exploits UM features such as asynchronous prefetching, managed page migration, GPU-based page faults, and the cudaMemAdvise interface to enable efficient out-of-core training for very large DNNs, and 2) an interception library that transparently leverages these features for other frameworks. We provide a comprehensive performance characterization of our designs. OC-Caffe provides performance comparable to Caffe for regular DNNs. OC-Caffe-Opt is up to 1.9X faster than OC-Caffe-Naive and up to 5X faster than optimized CPU-based training for out-of-core workloads. OC-Caffe also allows scale-up (DGX-1) and scale-out on multi-GPU clusters.
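The Unified Memory features named in the abstract can be illustrated with a short CUDA sketch. This is not OC-Caffe code: the kernel `scale_kernel`, the helper `scale_with_um_hints`, and the choice of device 0 are hypothetical stand-ins, and the calls assume the integer-device signatures of `cudaMemAdvise` and `cudaMemPrefetchAsync` from the CUDA 9 era runtime (newer toolkits may require location-based variants). The hints are only issued when the device reports concurrent managed access, which is expected on Pascal and Volta GPUs under Linux.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a readable message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                           \
    do {                                                           \
        cudaError_t err_ = (call);                                 \
        if (err_ != cudaSuccess) {                                 \
            std::fprintf(stderr, "%s failed: %s\n", #call,         \
                         cudaGetErrorString(err_));                \
            std::exit(EXIT_FAILURE);                               \
        }                                                          \
    } while (0)

// Hypothetical stand-in for a layer's computation: scale a buffer in place.
__global__ void scale_kernel(float* data, size_t n, float factor) {
    size_t i = static_cast<size_t>(blockIdx.x) * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Allocates n floats (n > 0) of managed memory, fills them with 1.0f on the
// host, scales them on the GPU, and returns the first element read on the host.
float scale_with_um_hints(size_t n, float factor) {
    const int device = 0;  // assumption: a single GPU with index 0
    const size_t bytes = n * sizeof(float);
    CUDA_CHECK(cudaSetDevice(device));

    // Prefetching and advice need hardware page-fault support.
    int concurrent = 0;
    CUDA_CHECK(cudaDeviceGetAttribute(
        &concurrent, cudaDevAttrConcurrentManagedAccess, device));

    // One pointer valid on host and device; pages migrate on demand.
    float* data = nullptr;
    CUDA_CHECK(cudaMallocManaged(&data, bytes));
    for (size_t i = 0; i < n; ++i) data[i] = 1.0f;  // first touch on the CPU

    cudaStream_t stream;
    CUDA_CHECK(cudaStreamCreate(&stream));

    if (concurrent) {
        // Hint that the pages should preferably reside on the GPU.
        CUDA_CHECK(cudaMemAdvise(data, bytes,
                                 cudaMemAdviseSetPreferredLocation, device));
        // Migrate pages ahead of the kernel instead of faulting them in.
        CUDA_CHECK(cudaMemPrefetchAsync(data, bytes, device, stream));
    }

    const unsigned threads = 256;
    const unsigned blocks = static_cast<unsigned>((n + threads - 1) / threads);
    scale_kernel<<<blocks, threads, 0, stream>>>(data, n, factor);
    CUDA_CHECK(cudaGetLastError());

    if (concurrent) {
        // Bring the results back toward the CPU before the host reads them.
        CUDA_CHECK(cudaMemPrefetchAsync(data, bytes, cudaCpuDeviceId, stream));
    }
    CUDA_CHECK(cudaDeviceSynchronize());

    const float first = data[0];
    CUDA_CHECK(cudaStreamDestroy(stream));
    CUDA_CHECK(cudaFree(data));
    return first;
}
```

Calling `scale_with_um_hints(1 << 24, 2.0f)` from a `main` function and compiling with `nvcc` exercises the whole path. Without the prefetch calls the kernel would still run, with pages faulted in on first access; the prefetch and advice calls are the kind of optimization the abstract attributes to OC-Caffe-Opt over OC-Caffe-Naive, though the paper's actual placement of these calls is not shown here.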
Year: 2018
DOI: 10.1109/HiPC.2018.00024
Venue: HiPC
Keywords: Training, Graphics processing units, Memory management, Hardware, Prefetching, Resource management
Field: Resource management, Asynchronous communication, Computer science, CUDA, Caffe, Parallel computing, Exploit, Out-of-core algorithm, Memory management, Page fault
DocType: Conference
ISSN: 1094-7256
ISBN: 978-1-5386-8386-6
Citations: 3
PageRank: 0.48
References: 0
Authors: 5
Name | Order | Citations | PageRank
Ammar Ahmad Awan | 1 | 91 | 10.84
Ching-Hsiang Chu | 2 | 61 | 11.21
Hari Subramoni | 3 | 466 | 50.51
Xiaoyi Lu | 4 | 602 | 60.53
Dhabaleswar K. Panda | 5 | 5366 | 446.70