Abstract | ||
---|---|---|
Under the implementations of mainstream DL frameworks, scarce GPU memory is the primary bottleneck that limits both the trainability and the training efficiency of ultra-deep neural networks (UDNN). Prior memory optimization work focuses on removing the trainability restriction but leaves training efficiency out of consideration. To fill this gap, we present "AccUDNN", an accelerator that aims to make full use of finite GPU memory to speed up UDNN training. AccUDNN consists of two main modules: a memory optimizer and a hyperparameter tuner. The memory optimizer uses a novel performance-model-guided dynamic swap-out/in strategy that first ensures trainability and then remedies the efficiency degradation seen in other swapping strategies. The hyperparameter tuner then explores the efficiency-optimal minibatch size and the matching learning rate once the dynamic swapping strategy is applied. Evaluations demonstrate that AccUDNN cuts the GPU memory requirement of ResNet-152 from more than 24GB to 8GB. In turn, given a 12GB GPU memory budget, the efficiency-optimal minibatch size can be 4.2x larger than with Caffe, which improves the scaling efficiency (speedup) of an 8-GPU cluster by 1.9x. |
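
The abstract does not say how the hyperparameter tuner searches, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a set of hypothetical profiled candidates (`Candidate`, with made-up memory and throughput figures), keeps those that fit a memory budget, picks the one with the highest throughput, and matches the learning rate with the common linear scaling heuristic (learning rate proportional to minibatch size). The function name `pick_batch_and_lr` and all numbers are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One profiled minibatch size (all values here are hypothetical)."""
    batch_size: int
    memory_gb: float   # assumed peak GPU memory at this minibatch size
    throughput: float  # assumed training throughput in samples/second


def pick_batch_and_lr(candidates, memory_budget_gb, base_batch_size, base_lr):
    """Pick the highest-throughput minibatch size that fits the budget.

    The learning rate is scaled linearly with the minibatch size, which is
    a common heuristic and an assumption here, not a rule taken from the
    abstract.
    """
    feasible = [c for c in candidates if c.memory_gb <= memory_budget_gb]
    if not feasible:
        raise ValueError("no candidate fits the memory budget")
    best = max(feasible, key=lambda c: c.throughput)
    lr = base_lr * best.batch_size / base_batch_size
    return best.batch_size, lr


# Made-up profile: memory and throughput both grow with minibatch size.
candidates = [
    Candidate(batch_size=16, memory_gb=6.0, throughput=100.0),
    Candidate(batch_size=32, memory_gb=9.0, throughput=150.0),
    Candidate(batch_size=64, memory_gb=11.5, throughput=180.0),
    Candidate(batch_size=128, memory_gb=20.0, throughput=200.0),
]

batch_size, lr = pick_batch_and_lr(
    candidates, memory_budget_gb=12.0, base_batch_size=16, base_lr=0.01
)
print(batch_size, lr)
```

With a 12GB budget the 128-sample candidate is excluded because its assumed footprint is 20GB, so the sketch settles on the largest feasible, fastest candidate and scales the base learning rate by the same factor as the minibatch size.
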
Year | DOI | Venue |
---|---|---|
2019 | 10.1109/ICCD46524.2019.00017 | 2019 IEEE 37th International Conference on Computer Design (ICCD) |

Keywords | Field | DocType |
---|---|---|
ultra-deep neural network, GPU, memory optimization, speed up | Bottleneck, Hyperparameter, Computer science, CUDA, Caffè, Parallel computing, Artificial intelligence, Deep learning, Artificial neural network, Tuner, Speedup | Conference |

ISSN | ISBN | Citations |
---|---|---|
1063-6404 | 978-1-7281-1215-2 | 0 |

PageRank | References | Authors |
---|---|---|
0.34 | 14 | 8 |

Name | Order | Citations | PageRank |
---|---|---|---|
Jinrong Guo | 1 | 7 | 2.55 |
Wantao Liu | 2 | 73 | 8.29 |
Lili Wang | 3 | 16 | 5.48 |
Chunrong Yao | 4 | 3 | 1.76 |
Jizhong Han | 5 | 355 | 54.72 |
Ruixuan Li | 6 | 405 | 69.47 |
Yijun Lu | 7 | 16 | 4.51 |
Songlin Hu | 8 | 126 | 30.82 |