Paper Info

Title
Morpheus: Towards Automated SLOs for Enterprise Clusters.

Abstract
Modern resource management frameworks for large-scale analytics leave unresolved the problematic tension between high cluster utilization and predictable job performance, which are coveted by operators and users, respectively. We address this in Morpheus, a new system that: 1) codifies implicit user expectations as explicit Service Level Objectives (SLOs), inferred from historical data, 2) enforces SLOs using novel scheduling techniques that isolate jobs from sharing-induced performance variability, and 3) mitigates inherent performance variance (e.g., due to failures) by dynamically reprovisioning jobs. We validate these ideas against production traces from a 50k-node cluster and show that Morpheus can lower the number of deadline violations by 5× to 13× while retaining cluster utilization and lowering cluster footprint by 14% to 28%. We demonstrate the scalability and practicality of our implementation by deploying Morpheus on a 2700-node cluster and running it against production-derived workloads.

Year	Venue	Field
2016	OSDI	Resource management, Service level objective, User expectations, Computer science, Scheduling (computing), Real-time computing, Operator (computer programming), Footprint, Analytics, Operating system, Scalability, Distributed computing
DocType	Citations	PageRank
Conference	21	0.72
References	Authors
24	11

Authors (11 rows)

Name	Order	Citations	PageRank
Sangeetha Abdu Jyothi	1	48	5.74
Carlo Curino	2	2012	90.35
Ishai Menache	3	1022	52.56
Shravan Matthur Narayanamurthy	4	28	1.55
Alexey Tumanov	5	554	24.61
Jonathan Yaniv	6	100	4.74
Ruslan Mavlyutov	7	30	3.19
Iñigo Goiri	8	1039	49.27
Subru Krishnan	9	79	6.36
Janardhan Kulkarni	10	153	17.73
Sriram Rao	11	440	23.78