Title
REEF: retainable evaluator execution framework
Abstract
In this demo proposal, we describe REEF, a framework that makes it easy to implement scalable, fault-tolerant runtime environments for a range of computational models. We will demonstrate diverse workloads, including extract-transform-load MapReduce jobs, iterative machine learning algorithms, and ad-hoc declarative query processing. At its core, REEF builds atop YARN (Apache Hadoop 2's resource manager) to provide retainable hardware resources with lifetimes that are decoupled from those of computational tasks. This allows us to build persistent (cross-job) caches and cluster-wide services, but, more importantly, supports high-performance iterative graph processing and machine learning algorithms. Unlike existing systems, REEF aims for composability of jobs across computational models, providing significant performance and usability gains, even with legacy code. REEF includes a library of interoperable data management primitives optimized for communication and data movement (which are distinct from storage locality). The library also allows REEF applications to access external services, such as user-facing relational databases. We were careful to decouple lower levels of REEF from the data models and semantics of systems built atop it. The result was two new standalone systems: Tang, a configuration manager and dependency injector, and Wake, a state-of-the-art event-driven programming and data movement framework. Both are language independent, allowing REEF to bridge the JVM and .NET.
Year
2013
DOI
10.14778/2536274.2536318
Venue
PVLDB
Keywords
retainable evaluator execution framework, ad-hoc declarative query processing, high-performance iterative graph processing, interoperable data management primitive, reef application, computational task, configuration manager, data model, computational model, data movement, data movement framework
Field
Data mining, Data modeling, Locality, Relational database, Computer science, Legacy code, Configuration management, Composability, Data management, Database, Scalability
DocType
Journal
Volume
6
Issue
12
ISSN
2150-8097
Citations
5
PageRank
1.30
References
7
Authors
12
Name                       Order  Citations  PageRank
Byung-Gon Chun             1      3832       234.37
Tyson Condie               2      1162       64.84
Carlo Curino               3      2012       90.35
Chris Douglas              4      667        23.01
Sergiy Matusevych          5      21         4.04
Brandon Myers              6      61         5.14
Shravan M. Narayanamurthy  7      407        19.83
Raghu Ramakrishnan         8      17         3.03
Sriram Rao                 9      440        23.78
Josh Rosen                 10     486        18.77
Russell Sears              11     1799       85.12
Markus Weimer              12     827        50.86