Title
CoBell: Runtime Prediction for Distributed Dataflow Jobs in Shared Clusters
Abstract
Distributed dataflow systems have been developed to help users analyze and process large datasets. While they make it easier to develop massively parallel programs, users still have to choose the amount of resources for executing their jobs. Users often have constraints such as runtime targets and budgets, yet they do not necessarily understand workload and system dynamics. To address this problem, systems have been developed that automatically select the amount of resources required to fulfill the users' constraints. However, interference with co-located workloads can introduce significant variance into job runtimes and make accurate runtime prediction harder. This paper presents CoBell, a resource allocation system that incorporates information about co-located workloads to improve runtime prediction for jobs in shared clusters. CoBell receives jobs from users with runtime and scale-out constraints and then reserves resources based on predicted runtimes. We implemented CoBell as a job submission tool for YARN, so it works with existing YARN cluster setups. The paper evaluates CoBell using five different distributed dataflow jobs, showing that runtimes with CoBell do not violate the runtime constraints by more than 7.2%.
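The allocation step described in the abstract (choose a scale-out that satisfies a runtime constraint, given a runtime prediction) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the toy runtime model, and the interference factor are hypothetical and are not taken from CoBell, whose actual prediction model the abstract does not specify.

```python
from typing import Callable, Optional


def select_scale_out(
    predict_runtime: Callable[[int], float],
    min_nodes: int,
    max_nodes: int,
    runtime_target_s: float,
) -> Optional[int]:
    """Return the smallest scale-out in [min_nodes, max_nodes] whose
    predicted runtime meets the target, or None if no scale-out does."""
    for nodes in range(min_nodes, max_nodes + 1):
        if predict_runtime(nodes) <= runtime_target_s:
            return nodes
    return None


def toy_model(interference_factor: float) -> Callable[[int], float]:
    """Hypothetical runtime model (an assumption, not CoBell's model):
    a fixed serial part plus a perfectly parallel part, slowed down by a
    multiplicative factor standing in for co-located workloads."""
    serial_s, parallel_s = 60.0, 1200.0

    def predict(nodes: int) -> float:
        return (serial_s + parallel_s / nodes) * (1.0 + interference_factor)

    return predict


# Same job and runtime target, with and without assumed interference.
isolated = select_scale_out(toy_model(0.0), 2, 16, 320.0)
shared = select_scale_out(toy_model(0.25), 2, 16, 320.0)
print(isolated, shared)
```

In this toy setting, the same runtime target requires a larger scale-out once the assumed slowdown is included, which is the kind of effect a prediction that ignores co-located workloads would miss.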
Year: 2018
DOI: 10.1109/CloudCom2018.2018.00029
Venue: 2018 IEEE International Conference on Cloud Computing Technology and Science (CloudCom)
Keywords: Scalable Data Analytics, Distributed Dataflows, Runtime Prediction, Resource Allocation, Cluster Management
Field: Cluster (physics), Yarn, Computer science, Workload, Runtime prediction, Dataflow, Resource allocation, System dynamics, Distributed computing
DocType: Conference
ISSN: 2330-2194
ISBN: 978-1-5386-7900-5
Citations: 0
PageRank: 0.34
References: 19
Authors: 4
Name             Order  Citations/PageRank
Ilya Verbitskiy  1      52.12
Lauritz Thamsen  2      439.26
Thomas Renner    3      185.47
Odej Kao         4      106696.19