Title
Large-scale incremental data processing with change propagation
Abstract
Incremental processing of large-scale data is an increasingly important problem, given that many processing jobs run repeatedly with similar inputs, and that the de facto standard programming model (MapReduce) was not designed to efficiently process small updates. As a result, new systems specifically targeting this problem (e.g., Google Percolator, or Yahoo! CBP) have been proposed. Unfortunately, these approaches require the adoption of a new programming model, breaking compatibility with existing programs, and increasing the burden on the programmer, who is now required to devise an incremental update mechanism. We claim that automatic incremental processing of large-scale data is possible by leveraging previous results from the algorithms and programming languages communities. As an example, we describe how MapReduce can be improved to efficiently handle small input changes by automatically incrementalizing existing MapReduce computations, without breaking backward compatibility or requiring programmers to adopt a new programming approach.
Year
2011
Venue
HotCloud
Keywords
new programming model, new programming approach, programming languages community, processing job, incremental processing, incremental update mechanism, large-scale incremental data, automatic incremental processing, change propagation, new system, large-scale data, existing mapreduce computation, data processing
DocType
Conference
Citations
17
PageRank
0.89
References
10
Authors
5
Name, Order, Citations, PageRank
Pramod Bhatotia, 1, 414, 28.94
Alexander Wieder, 2, 245, 11.43
Istemi Ekin Akkus, 3, 68, 6.96
Rodrigo Rodrigues, 4, 1049, 53.56
Umut A. Acar, 5, 716, 45.83