Title
Improving Scheduling Efficiency of Hadoop YARN Using AFSA Algorithm.
Abstract
Apache Hadoop is one of the most popular MapReduce frameworks for parallel processing of large data sets. As the job scheduler and resource manager, YARN plays a very important role. Schedulers on YARN are designed to minimize the makespan of MapReduce jobs. The performance of a scheduler in YARN depends not only on whether the resource capacity of the worker nodes is fully utilized, but also on the dependencies among the tasks being scheduled. It is therefore very difficult to achieve an optimal solution. This paper proposes a new Hadoop YARN scheduling algorithm. The algorithm formalizes the problem as a multiple knapsack problem that takes into consideration the resource cost and time cost of each task as well as the dependencies between tasks. The Artificial Fish Swarm Algorithm (AFSA) is adopted to solve the knapsack optimization problem. The algorithm was implemented as a pluggable scheduler on the most recent version of Hadoop YARN and evaluated with several MapReduce benchmarks. The experimental results show that our scheduler can reduce the makespan of Hadoop jobs by 30% compared with some existing scheduling policies.
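The abstract does not give the formulation itself, so the following is only a minimal sketch of how a multiple-knapsack view of task placement might be explored with an AFSA-style search. Everything in it is an assumption made for illustration: the toy tasks, the node capacities, the objective (total time cost of the tasks packed into one scheduling round), and the names `TASKS`, `DEPENDS_ON`, `NODE_CAPACITY`, `is_feasible`, `fitness`, `prey` and `search`. Only the "prey" behaviour of AFSA is shown; the "swarm" and "follow" behaviours are omitted, and this is not the paper's scheduler.

```python
import random

# Hypothetical toy instance (not from the paper): task -> (resource demand, time cost).
TASKS = {"map1": (2, 4), "map2": (3, 5), "map3": (2, 3), "reduce1": (4, 6)}
# Assumed dependency rule: a task may be placed only if all of its parents are placed.
DEPENDS_ON = {"reduce1": ["map1", "map2"]}
# Each worker node is treated as one knapsack with a resource capacity.
NODE_CAPACITY = [5, 5]
UNPLACED = -1


def is_feasible(assignment):
    """Check node capacities and task dependencies for one candidate placement."""
    load = [0] * len(NODE_CAPACITY)
    for task, node in assignment.items():
        if node != UNPLACED:
            load[node] += TASKS[task][0]
    if any(used > cap for used, cap in zip(load, NODE_CAPACITY)):
        return False
    for task, parents in DEPENDS_ON.items():
        if assignment[task] != UNPLACED and any(
            assignment[p] == UNPLACED for p in parents
        ):
            return False
    return True


def fitness(assignment):
    """Total time cost of the placed tasks; infeasible candidates score -1."""
    if not is_feasible(assignment):
        return -1
    return sum(TASKS[t][1] for t, n in assignment.items() if n != UNPLACED)


def prey(fish, rng, tries=5):
    """Prey step: try random single-task moves and keep the first improvement."""
    current = fitness(fish)
    for _ in range(tries):
        candidate = dict(fish)
        task = rng.choice(sorted(TASKS))
        candidate[task] = rng.randrange(-1, len(NODE_CAPACITY))
        if fitness(candidate) > current:
            return candidate
    return fish


def search(n_fish=8, iterations=50, seed=0):
    """Run the prey step on a small swarm and return the best placement found."""
    rng = random.Random(seed)
    swarm = [{t: UNPLACED for t in TASKS} for _ in range(n_fish)]
    for _ in range(iterations):
        swarm = [prey(fish, rng) for fish in swarm]
    return max(swarm, key=fitness)


if __name__ == "__main__":
    best = search()
    print(best, fitness(best))
```

Because every fish starts from the empty (feasible) placement and a move is accepted only when it strictly improves the fitness, each fish stays feasible throughout the search. A fuller AFSA would add the swarm and follow behaviours so that fish also move toward better neighbours and toward the local centre of the swarm.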
Year
2017
Venue
ISPA/IUCC
Field
Resource management, Yarn, Job shop scheduling, Computer science, Scheduling (computing), Algorithm, Linear programming, Job scheduler, Knapsack problem, Optimization problem
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
3
Name          Order  Citations  PageRank
Jie Tang      1      5871       300.22
Junlei Gao    2      0          0.34
Gangshan Wu   3      275        36.63