Title
Reservation-based Scheduling: If You're Late Don't Blame Us!
Abstract
The continuous shift towards data-driven approaches to business, and a growing attention to improving return on investment (ROI) for cluster infrastructures, are generating new challenges for big-data frameworks. Systems originally designed for big batch jobs now handle an increasingly complex mix of computations. Moreover, they are expected to guarantee stringent SLAs for production jobs and minimize latency for best-effort jobs. In this paper, we introduce reservation-based scheduling, a new approach to this problem. We develop our solution around four key contributions: 1) we propose a reservation definition language (RDL) that allows users to declaratively reserve access to cluster resources, 2) we formalize planning of current and future cluster resources as a Mixed-Integer Linear Programming (MILP) problem, and propose scalable heuristics, 3) we adaptively distribute resources between production jobs and best-effort jobs, and 4) we integrate all of this in a scalable system named Rayon, which builds upon Hadoop / YARN. We evaluate Rayon on a 256-node cluster against workloads derived from Microsoft, Yahoo!, Facebook, and Cloudera's clusters. To enable practical use of Rayon, we open-sourced our implementation as part of Apache Hadoop 2.6.
Year
2014
DOI
10.1145/2670979.2670981
Venue
SoCC
Keywords
algorithms,design,distributed systems,scheduling,experimentation,measurement,performance
Field
Reservation,Yarn,Return on investment,Scheduling (computing),Computer science,Computer network,Real-time computing,Heuristics,Linear programming,Batch processing,Scalability
DocType
Conference
Citations
44
PageRank
1.32
References
28
Authors
6
Name,Order,Citations,PageRank
Carlo Curino,1,2012,90.35
Djellel Eddine Difallah,2,492,22.96
Chris Douglas,3,667,23.01
Subru Krishnan,4,79,6.36
Raghu Ramakrishnan,5,12649,2243.05
Sriram Rao,6,440,23.78