Abstract |
---|
Cloud intelligence applications often perform iterative computations (e.g., PageRank) on constantly changing data sets (e.g., the Web graph). While previous studies extend MapReduce for efficient iterative computation, it is too expensive to run an entirely new large-scale iterative MapReduce job every time the underlying data sets change. In this paper, we propose i2MapReduce to support incremental iterative computation. We observe that in many cases the changes affect only a very small fraction of the data sets, and the newly converged state is quite close to the previously converged state. i2MapReduce exploits this observation to save re-computation by starting from the previously converged state and by performing incremental updates on the changed data. Our preliminary results are promising: i2MapReduce shows a significant performance improvement over re-computing iterative jobs in MapReduce. |

Field | Value |
---|---|
Year | 2013 |
DOI | 10.1145/2501928.2501930 |
Venue | Cloud-I |
Keywords | new change, underlying data set, new large-scale MapReduce iterative, efficient iterative computation, data set, changes impact, iterative job, incremental iterative computation, iterative computation, incremental updates |
DocType | Conference |
Citations | 0 |
PageRank | 0.34 |
References | 15 |
Authors | 2 |

Name | Order | Citations | PageRank |
---|---|---|---|
Yanfeng Zhang | 1 | 170 | 15.56 |
Shimin Chen | 2 | 560 | 29.44 |
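
The abstract's observation that the newly converged state lies close to the previous one can be illustrated with a small single-process sketch. This is not the i2MapReduce implementation: it covers only the warm-start half of the idea (it still recomputes every node on each iteration rather than updating only the changed data), and the function `pagerank`, its parameters, and the four-node example graph are all hypothetical.

```python
def pagerank(graph, start=None, damping=0.85, tol=1e-10, max_iters=1000):
    """Power-iteration PageRank. Returns (ranks, iterations used).

    graph maps each node to a list of its out-neighbours; every neighbour
    must also appear as a key. `start` is an optional previous rank vector
    used as the initial state (warm start); otherwise ranks start uniform.
    """
    nodes = sorted(graph)
    n = len(nodes)
    # Nodes missing from `start` (e.g. newly added ones) get the uniform value.
    ranks = {v: (start or {}).get(v, 1.0 / n) for v in nodes}
    total = sum(ranks.values())
    ranks = {v: r / total for v, r in ranks.items()}  # renormalise to sum 1

    for iteration in range(1, max_iters + 1):
        new = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = graph[v] or nodes  # dangling node: spread rank evenly
            share = damping * ranks[v] / len(targets)
            for t in targets:
                new[t] += share
        delta = sum(abs(new[v] - ranks[v]) for v in nodes)  # L1 change
        ranks = new
        if delta < tol:
            return ranks, iteration
    return ranks, max_iters


# Hypothetical example: one edge (d -> a) is added to a small graph.
old_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
new_graph = dict(old_graph, d=["c", "a"])

old_ranks, old_iters = pagerank(old_graph)

# Cold start: recompute on the changed graph from the uniform state.
cold_ranks, cold_iters = pagerank(new_graph)

# Warm start: begin from the previously converged state instead.
warm_ranks, warm_iters = pagerank(new_graph, start=old_ranks)

print("cold-start iterations:", cold_iters)
print("warm-start iterations:", warm_iters)
```

Both runs converge to the same ranks within the tolerance. When the change is small, the warm start typically needs fewer iterations because its initial state is already near the new fixed point. The saving from re-processing only the changed data, which the abstract also describes, is not modelled here.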