Title
Optimizing Cross-Platform Data Movement
Abstract
Data analytics are moving beyond the limits of a single data processing platform. A cross-platform query optimizer is necessary to enable applications to run their tasks over multiple platforms efficiently and in a platform-agnostic manner. For the optimizer to be effective, it must consider data movement costs across different data processing platforms. In this paper, we present the graph-based data movement strategy used by Rheem, our open-source cross-platform system. In particular, we (i) model the data movement problem as a new graph problem, which we prove to be NP-hard, and (ii) propose a novel graph exploration algorithm, which allows Rheem to discover multiple hidden opportunities for cross-platform data processing.
Year: 2019
DOI: 10.1109/ICDE.2019.00162
Venue: 2019 IEEE 35th International Conference on Data Engineering (ICDE)
Keywords: Data processing, Task analysis, Java, Data models, Sparks, Data structures, Query processing
Field: Query optimization, Data structure, Data mining, Data modeling, Data processing, Task analysis, Data analysis, Computer science, Cross-platform, Java
DocType: Conference
ISSN: 1084-4627
ISBN: 978-1-5386-7474-1
Citations: 0
PageRank: 0.34
References: 0
Authors: 6
Name, Order, Citations, PageRank
Sebastian Kruse, 1, 51, 8.03
Zoi Kaoudi, 2, 215, 18.55
Jorge-Arnulfo Quiané-Ruiz, 3, 986, 61.02
Sanjay Chawla, 4, 1372, 105.09
Felix Naumann, 5, 1900, 174.92
Bertty Contreras-Rojas, 6, 1, 2.46