Abstract | ||
---|---|---|
Performance and resource optimization is an important research problem in data-intensive distributed computing. We present a new batched stream processing model that captures query correlations to expose I/O and computation redundancies for optimizations. The model is inspired by our empirical study on a trace from a production large-scale data processing cluster, which reveals significant redundancies caused by strong temporal and spatial correlations among queries. We have developed Comet, a query processing system that embraces the batched stream processing model for optimizations. We have integrated Comet with DryadLINQ. With its roots in query optimizations for database systems, Comet enables a set of new heuristics and opportunities tailored for distributed computing in DryadLINQ. Optimizations in Comet are effective. The evaluation of a micro-benchmark on a 40-machine cluster shows a 42% reduction in total machine time and over 40% reduction in total I/O. Our simulation on a real trace covering over 19 million machine hours shows an estimated I/O saving of over 50%. |
Year | DOI | Venue |
---|---|---|
2010 | 10.1145/1807128.1807139 | SoCC |
Keywords | Field | DocType |
batch computation,traditional batch processing model,incrementally bulk-appended data stream,effective query optimizations,batched stream processing,I/O reduction,I/O saving,40-node cluster,large-scale production data-processing cluster,query processing system,database system,distributed computing,empirical study,query optimization,data processing,spatial correlation,batch process,resource management,resource manager,stream processing | Resource management,Data stream mining,Data processing,Computer science,Parallel computing,Real-time computing,Comet,Batch processing,Stream processing,Computation | Conference |
Citations | PageRank | References |
72 | 5.27 | 28 |
Authors | ||
7 |
Name | Order | Citations | PageRank |
---|---|---|---|
Bingsheng He | 1 | 2810 | 179.09 |
Mao Yang | 2 | 496 | 30.94 |
Zhenyu Guo | 3 | 512 | 39.61 |
Rishan Chen | 4 | 326 | 17.81 |
Bing Su | 5 | 81 | 6.31 |
Wei Lin | 6 | 229 | 24.46 |
Lidong Zhou | 7 | 2136 | 147.82 |