Title
A new analytical technique for designing provably efficient MapReduce schedulers
Abstract
With the rapid increase in size and number of jobs that are being processed in the MapReduce framework, efficiently scheduling jobs under this framework is becoming increasingly important. We consider the problem of minimizing the total flowtime of a sequence of jobs in the MapReduce framework, where the jobs arrive over time and need to be processed through both Map and Reduce procedures before leaving the system. We show that for this problem for non-preemptive tasks, no on-line algorithm can achieve a constant competitive ratio (defined as the ratio between the completion time of the online algorithm to the completion time of the optimal non-causal off-line algorithm). We then construct a slightly weaker metric of performance called the efficiency ratio. An online algorithm is said to achieve an efficiency ratio of γ when the flow-time incurred by that scheduler divided by the minimum flow-time achieved over all possible schedulers is almost surely less than or equal to γ. Under some weak assumptions, we then show a surprising property that, for the flow-time problem, any work-conserving scheduler has a constant efficiency ratio in both preemptive and nonpreemptive scenarios. More importantly, we are able to develop an online scheduler with a very small efficiency ratio (2), and through simulations we show that it outperforms the state-of-the-art schedulers.
Year
DOI
Venue
2013
10.1109/INFCOM.2013.6566956
INFOCOM
Keywords
Field
DocType
mapreduce framework,scheduling,constant efficiency ratio,work-conserving scheduler,parallel programming,online algorithm,mapreduce schedulers,nonpreemptive tasks,completion time,optimal noncausal off-line algorithm,flow-time problem,parallel algorithms,online scheduler,data handling,job scheduling,schedules,algorithm design and analysis,scheduling algorithms,writing
Online algorithm,Fixed-priority pre-emptive scheduling,Division (mathematics),Scheduling (computing),Parallel algorithm,Computer science,Parallel computing,Almost surely,Group method of data handling,Competitive analysis,Distributed computing
Conference
ISSN
ISBN
Citations 
0743-166X
978-1-4673-5944-3
26
PageRank 
References 
Authors
1.11
5
3
Name
Order
Citations
PageRank
Zheng Yousi1413.51
N. B. Shroff26994519.23
Prasun Sinha32527181.57