Title
Predicting and reining in application-level slowdown on spatial multitasking GPUs
Abstract
Predicting the performance degradation of a GPU application under co-location on a spatial multitasking GPU, without prior application knowledge, is essential in public clouds. Prior work mainly targets CPU co-location and is inaccurate and/or inefficient at predicting the performance of applications co-located on spatial multitasking GPUs. Our investigation shows that hardware event statistics caused by co-located applications correlate strongly with their slowdowns. Based on this observation, we present Themis with a kernel slowdown model (Themis-KSM), which performs precise and efficient online application slowdown prediction without prior application knowledge. The kernel slowdown model is trained offline. When new applications co-run, Themis-KSM collects event statistics and predicts their slowdowns simultaneously. We also propose a two-stage slowdown prediction mechanism (Themis-TSP) for real-system GPUs that requires no hardware modification. Our evaluation shows that Themis has negligible runtime overhead, and that Themis-KSM and Themis-TSP predict application-level slowdown with prediction errors below 9.5% and 12.8%, respectively. Building on Themis, we implement an SM allocation engine to rein in application slowdown under co-location. Case studies show that the engine successfully enforces fair sharing and QoS.
Year: 2020
DOI: 10.1016/j.jpdc.2020.03.009
Venue: Journal of Parallel and Distributed Computing
Keywords: Spatial Multitasking GPU, Sharing GPU, Co-location, Slowdown prediction, Performance prediction
DocType: Journal
Volume: 141
ISSN: 0743-7315
Citations: 3
PageRank: 0.38
References: 0
Authors: 8
Name
Order
Citations
PageRank
Mengze Wei130.38
Wenyi Zhao230.38
Quan Chen317521.86
Hao Dai430.38
Jingwen Leng54912.97
Chao Li634437.85
Wenli Zheng741.09
Minyi Guo83514.13