Title |
---|
Understanding and optimizing packed neural network training for hyper-parameter tuning
Abstract |
---|
As neural networks are increasingly employed in machine learning practice, how to efficiently share limited training resources among a diverse set of model training tasks becomes a crucial issue. To achieve better utilization of the shared resources, this paper explores the idea of jointly training multiple neural network models on a single GPU. We realize this idea by proposing a primitive called pack. We further present a comprehensive empirical study of pack and end-to-end experiments that suggest significant improvements for hyperparameter tuning. The results suggest: (1) packing two models can bring up to 40% performance improvement over unpacked setups for a single training step, and the improvement increases when more models are packed; (2) the benefit of the pack primitive largely depends on a number of factors, including memory capacity, chip architecture, neural network structure, and batch size; (3) there exists a trade-off between packing and unpacking when training multiple neural network models on limited resources; (4) a pack-aware Hyperband is up to 2.7x faster than the original Hyperband, and this improvement grows as memory size, and hence the density of packed models, increases.
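The core idea behind pack is to issue the work of several independent training jobs to one GPU in a single combined step instead of running them in separate sessions. The sketch below is only a minimal illustration of that idea, not the paper's implementation; the models, optimizers, and the `packed_step` helper are hypothetical choices made for the example.

```python
# Illustrative sketch: "packing" two independent models into one training step
# on a single GPU. Hypothetical example, not the pack primitive from the paper.
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

def make_model():
    # Small stand-in model; a packed setup would typically use two trial
    # configurations from a hyper-parameter search.
    return nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10)).to(device)

model_a, model_b = make_model(), make_model()
opt_a = torch.optim.SGD(model_a.parameters(), lr=0.1)    # trial A's hyper-parameters
opt_b = torch.optim.SGD(model_b.parameters(), lr=0.01)   # trial B's hyper-parameters
loss_fn = nn.CrossEntropyLoss()

def packed_step(x, y):
    """Run one training step for both models as a single combined launch."""
    opt_a.zero_grad()
    opt_b.zero_grad()
    # Both forward passes are issued before any backward pass, so the GPU can
    # overlap the two models' kernels where memory and compute allow.
    loss_a = loss_fn(model_a(x), y)
    loss_b = loss_fn(model_b(x), y)
    # The parameter sets are disjoint, so one backward over the summed loss
    # produces exactly the per-model gradients of two separate steps.
    (loss_a + loss_b).backward()
    opt_a.step()
    opt_b.step()
    return loss_a.item(), loss_b.item()

x = torch.randn(128, 32, device=device)
y = torch.randint(0, 10, (128,), device=device)
print(packed_step(x, y))
```

Whether a step like this actually beats two unpacked steps depends, as the abstract notes, on memory capacity, chip architecture, network structure, and batch size.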
Year | DOI | Venue |
---|---|---|
2021 | 10.1145/3462462.3468880 | International Conference on Management of Data |
DocType | Citations | PageRank
---|---|---
Conference | 0 | 0.34
References | Authors
---|---
0 | 4
Name | Order | Citations | PageRank |
---|---|---|---|
Liu Rui | 1 | 0 | 0.34 |
S. Krishnan | 2 | 391 | 36.25 |
Aaron J. Elmore | 3 | 352 | 34.03 |
Michael J. Franklin | 4 | 17423 | 1681.10 |