Abstract |
---|
Data analytics are moving beyond the limits of a single data processing platform. A cross-platform query optimizer is necessary to enable applications to run their tasks over multiple platforms efficiently and in a platform-agnostic manner. For the optimizer to be effective, it must consider data movement costs across different data processing platforms. In this paper, we present the graph-based data movement strategy used by Rheem, our open-source cross-platform system. In particular, we (i) model the data movement problem as a new graph problem, which we prove to be NP-hard, and (ii) propose a novel graph exploration algorithm, which allows Rheem to discover multiple hidden opportunities for cross-platform data processing. |
Year | DOI | Venue |
---|---|---|
2019 | 10.1109/ICDE.2019.00162 | 2019 IEEE 35th International Conference on Data Engineering (ICDE) |
Keywords | Field | DocType |
---|---|---|
Data processing, Task analysis, Java, Data models, Sparks, Data structures, Query processing | Query optimization, Data structure, Data mining, Data modeling, Data processing, Task analysis, Data analysis, Computer science, Cross-platform, Java | Conference |
ISSN | ISBN | Citations |
---|---|---|
1084-4627 | 978-1-5386-7474-1 | 0 |
PageRank | References | Authors |
---|---|---|
0.34 | 0 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Sebastian Kruse | 1 | 51 | 8.03 |
Zoi Kaoudi | 2 | 215 | 18.55 |
Jorge-Arnulfo Quiané-Ruiz | 3 | 986 | 61.02 |
Sanjay Chawla | 4 | 1372 | 105.09 |
Felix Naumann | 5 | 1900 | 174.92 |
Bertty Contreras-Rojas | 6 | 1 | 2.46 |