Title
Exploiting large system dynamics for designing simple data center schedulers
Abstract
The number and size of data centers has seen a rapid growth in the last few years. It is no longer uncommon to see large data centers with thousands or even tens of thousands of machines. Hence, it is critical to develop scalable scheduling mechanisms for processing the enormous number of jobs handled by popular paradigms such as the MapReduce framework. This work explores the possibility of simplifying the scheduling procedure by exploiting the “largeness” of the data center system. Specifically, we consider the problem of minimizing the total flow time of a sequence of jobs under the MapReduce framework, where the jobs arrive over time and need to be processed through both Map and Reduce procedures before leaving the system. We show that any work-conserving scheduler is asymptotically optimal under a wide range of traffic loads, including the heavy traffic limit. Our results are shown for scenarios in which the tasks can be preempted and served in parallel over different machines, as well as scenarios when each task has to be served only on one machine and cannot be preempted. This result implies, somewhat surprisingly, that when we have a large number of machines, there is little to be gained by optimizing beyond ensuring that a scheduler should be work-conserving. For long running applications, we also study the relationship between the number of machines and total running time, and show sufficient conditions to guarantee the asymptotic optimality of work-conserving schedulers. Further, we run extensive simulations, that indeed verify that when the total number of machines is large, state-of-the-art work-conserving schedulers have similar and close-to-optimal delay performance.
Year
DOI
Venue
2015
10.1109/INFOCOM.2015.7218405
2015 IEEE Conference on Computer Communications (INFOCOM)
Keywords
Field
DocType
large system dynamics,data center scheduler design,scheduling mechanism,MapReduce framework,optimization,asymptotic optimality,work-conserving scheduler
Scheduling (computing),Computer science,Parallel computing,Flow time,Scheduling (procedure),System dynamics,Data center,Asymptotically optimal algorithm,Distributed computing,Scalability
Conference
ISSN
Citations 
PageRank 
0743-166X
1
0.37
References 
Authors
7
4
Name
Order
Citations
PageRank
Zheng Yousi1413.51
N. B. Shroff26994519.23
Srikant, R.36868544.90
Prasun Sinha42527181.57