Abstract |
---|
MapReduce and related data-centric programming models have proven to be effective for a variety of large-scale distributed computations, in particular, those that manifest data parallelism. The fault-tolerance model underlying these programming environments relies on deterministic replay, which makes data-sharing (side-effects) across computations harder to support. This significantly limits the application scope of MapReduce and related models. This paper: (i) investigates data sharing (side-effects) in programming models operating on distributed key-value stores, specifically, the inconsistencies between the fault recovery mechanisms in execution and storage layers; (ii) defines semantics for a novel programming model, TransMR (Transactional MapReduce), which addresses these inconsistencies; and (iii) demonstrates broad application scope and enhanced performance through data-sharing across computations for a prototype implementation of the proposed semantics. |
Year | Venue | Keywords |
---|---|---|
2011 | HotCloud | manifest data parallelism, broad application scope, related data-centric programming model, proposed semantics, application scope, programming environment, fault-tolerance model, Transactional MapReduce, programming model, novel programming model |
Field | DocType | Citations |
Database-centric architecture, Programming paradigm, Computer science, Parallel computing, Inductive programming, Data sharing, Real-time computing, Data parallelism, Reactive programming, Semantics, Computation, Distributed computing | Conference | 5 |
PageRank | References | Authors |
0.51 | 13 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Naresh Rapolu | 1 | 27 | 2.15 |
Karthik Kambatla | 2 | 257 | 11.61 |
Suresh Jagannathan | 3 | 435 | 23.83 |
Ananth Grama | 4 | 1812 | 136.25 |