Title
Apache Hadoop YARN: Yet Another Resource Negotiator
Abstract
The initial design of Apache Hadoop [1] was tightly focused on running massive MapReduce jobs to process a web crawl. For increasingly diverse companies, Hadoop has become the data and computational agorá, the de facto place where data and computational resources are shared and accessed. This broad adoption and ubiquitous usage have stretched the initial design well beyond its intended target, exposing two key shortcomings: 1) tight coupling of a specific programming model with the resource management infrastructure, forcing developers to abuse the MapReduce programming model, and 2) centralized handling of jobs' control flow, which resulted in endless scalability concerns for the scheduler. In this paper, we summarize the design, development, and current state of deployment of the next generation of Hadoop's compute platform: YARN. The new architecture we introduced decouples the programming model from the resource management infrastructure, and delegates many scheduling functions (e.g., task fault-tolerance) to per-application components. We provide experimental evidence demonstrating the improvements we made, confirm improved efficiency by reporting the experience of running YARN in production environments (including 100% of Yahoo! grids), and confirm the flexibility claims by discussing the porting of several programming frameworks onto YARN, viz. Dryad, Giraph, Hoya, Hadoop MapReduce, REEF, Spark, Storm, and Tez.
Year
2013
DOI
10.1145/2523616.2523633
Venue
SoCC
Keywords
apache hadoop yarn, specific programming model, yarn viz, programming framework, resource management infrastructure, mapreduce programming model, resource negotiator, programming model, hadoop mapreduce, initial design, mapreduce job, apache hadoop, data center
Field
Resource management, Spark (mathematics), Yarn, Programming paradigm, Scheduling (computing), Computer science, Control flow, Real-time computing, Porting, Operating system, Scalability
DocType
Conference
Citations
559
PageRank
15.70
References
20
Authors
16
Name                      Order  Citations  PageRank
Vinod Kumar Vavilapalli   1      559        15.70
Arun C. Murthy            2      559        15.70
Chris Douglas             3      667        23.01
Sharad Agarwal            4      559        15.70
Mahadev Konar             5      1083       37.60
Robert Evans              6      559        15.70
Thomas Graves             7      559        15.70
Jason Lowe                8      559        15.70
Hitesh Shah               9      571        18.52
Siddharth Seth            10     883        36.16
Bikas Saha                11     1045       38.02
Carlo Curino              12     2012       90.35
Owen O'Malley             13     562        16.83
Sanjay R. Radia           14     2131       76.85
Benjamin Reed             15     2665       162.06
Eric Baldeschwieler       16     573        17.19