Title
TransMR: data-centric programming beyond data parallelism
Abstract
MapReduce and related data-centric programming models have proven to be effective for a variety of large-scale distributed computations, in particular, those that manifest data parallelism. The fault-tolerance model underlying these programming environments relies on deterministic replay, which makes data-sharing (side-effects) across computations harder to support. This significantly limits the application scope of MapReduce and related models. This paper: (i) investigates data sharing (side-effects) in programming models operating on distributed key-value stores, specifically, the inconsistencies between the fault recovery mechanisms in execution and storage layers; (ii) defines semantics for a novel programming model, TransMR (Transactional MapReduce), which addresses these inconsistencies; and (iii) demonstrates broad application scope and enhanced performance through data-sharing across computations for a prototype implementation of the proposed semantics.
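The tension the abstract describes, between replay-based fault recovery and side effects on a shared store, can be illustrated with a toy sketch. This is not the TransMR implementation or its semantics: the dictionary standing in for a key-value store, the function names, and the injected failure are all hypothetical, and the "transaction" is a single-threaded write buffer with no conflict detection.

```python
class TaskFailure(Exception):
    """Simulated crash of a map task partway through its input."""


def run_task(records, store, fail_after=None):
    """Hypothetical map task: add each record to a shared counter in `store`.

    Writes go straight to the store, so a crash leaves partial side effects.
    """
    for i, value in enumerate(records):
        if fail_after is not None and i == fail_after:
            raise TaskFailure
        store["total"] = store.get("total", 0) + value


def run_task_transactional(records, store, fail_after=None):
    """Same task, but writes are buffered and applied only if the task finishes."""
    buffered = dict(store)  # private copy; the shared store is untouched
    run_task(records, buffered, fail_after)
    store.update(buffered)  # commit point: reached only on success


def run_with_replay(task, records, store):
    """Run `task`; on a simulated failure, re-execute it from the start."""
    try:
        task(records, store, fail_after=2)
    except TaskFailure:
        task(records, store)  # replay without the injected failure


records = [1, 2, 3, 4]  # correct total is 10

plain = {}
run_with_replay(run_task, records, plain)

txn = {}
run_with_replay(run_task_transactional, records, txn)

print(plain["total"])  # 13: writes from the failed attempt are counted again
print(txn["total"])    # 10: only the successful attempt becomes visible
```

In the non-transactional run, the failed attempt has already written 1 + 2 to the store before the replay adds the full 10, so replay alone does not restore a consistent state. Buffering writes until a commit point makes the failed attempt invisible, which is the general idea behind pairing re-execution with transactional storage.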
Year
2011
Venue
HotCloud
Keywords
manifest data parallelism, broad application scope, related data-centric programming model, proposed semantics, application scope, programming environment, fault-tolerance model, Transactional MapReduce, programming model, novel programming model
Field
Database-centric architecture, Programming paradigm, Computer science, Parallel computing, Inductive programming, Data sharing, Real-time computing, Data parallelism, Reactive programming, Semantics, Computation, Distributed computing
DocType
Conference
Citations
5
PageRank
0.51
References
13
Authors
4
Name                Order  Citations  PageRank
Naresh Rapolu       1      27         2.15
Karthik Kambatla    2      57         11.61
Suresh Jagannathan  3      435        23.83
Ananth Grama        4      1812       136.25