Title
Rhythm: component-distinguishable workload deployment in datacenters
Abstract
Cloud service providers improve resource utilization by co-locating latency-critical (LC) workloads with best-effort (BE) batch jobs in datacenters. However, they usually treat an LC workload as a whole when allocating resources to BE jobs and neglect the differing characteristics of the workload's components. This coarse-grained co-location leaves significant room for improvement in resource utilization. Based on the observation that different LC components tolerate interference to different degrees, we propose a new abstraction called Servpod, which is a collection of the parts of an LC workload that are deployed together on the same physical machine, and show its merits for building a fine-grained co-location framework. The key idea is to differentiate the BE throughput launched with each LC Servpod, i.e., a Servpod with high interference tolerance can be deployed alongside more BE jobs. Based on Servpods, we present Rhythm, a co-location controller that maximizes resource utilization while guaranteeing the LC service's tail-latency requirement. It quantifies the interference tolerance of each Servpod by analyzing that Servpod's contribution to tail latency. We evaluate Rhythm using LC services in the form of containerized processes and microservices, and find that it can improve system throughput by 31.7%, CPU utilization by 26.2%, and memory bandwidth utilization by 34% while guaranteeing the SLA (service level agreement).
Year: 2020
DOI: 10.1145/3342195.3387534
Venue: EuroSys '20: Fifteenth EuroSys Conference 2020, Heraklion, Greece, April 2020
DocType: Conference
ISBN: 978-1-4503-6882-7
Citations: 4
PageRank: 0.40
References: 0
Authors: 7
Name            Order  Citations  PageRank
Laiping Zhao    1      18         5.04
Yanan Yang      2      5          1.42
Kaixuan Zhang   3      4          0.40
Xiaobo Zhou     4      64         16.25
Tie Qiu         5      895        80.18
Keqiu Li        6      1415       162.02
Yungang Bao     7      28         2.96