Abstract | ||
---|---|---|
This paper presents the first Datalog evaluation engine for executing graph analytics over BSP-style graph processing engines. Recent advances in Datalog that support efficient evaluation of aggregate functions make it easy for data scientists to author many important graph algorithms succinctly. Freed from the burden of low-level parallelization and optimization, they can avoid programming to the quirks of the latest high-performance distributed computing framework. Where prior approaches build bespoke evaluation engines or modify generalized dataflow processing engines to achieve performance, this work shows how to efficiently evaluate Datalog directly on BSP-style graph processing engines such as Giraph. The engine, Datalography, incorporates both traditional Datalog optimizations, such as semi-naive evaluation, and new evaluation algorithms and optimization techniques for efficient distributed evaluation of Datalog queries on graph processing engines. In particular, we develop evaluation techniques that take advantage of super vertices, eager aggregation, and asynchronous execution to optimize graph processing on Pregel-like systems. We implement our algorithms on top of Apache Giraph, and our results indicate that Datalography is competitive with native, tuned implementations, with some analytics running up to 9 times faster. |
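
The abstract names semi-naive evaluation as one of the traditional Datalog optimizations the engine uses. As a rough illustration of that general technique only, and not of the paper's implementation, the single-machine sketch below evaluates the textbook reachability program `path(x, y) :- edge(x, y).` and `path(x, z) :- path(x, y), edge(y, z).` The function name `transitive_closure_seminaive` and the in-memory set representation are assumptions made for this example. Each round joins only the facts discovered in the previous round (the delta) against `edge`, rather than re-joining every known fact.

```python
def transitive_closure_seminaive(edges):
    """Semi-naive evaluation of the Datalog program (illustrative sketch):

        path(x, y) :- edge(x, y).
        path(x, z) :- path(x, y), edge(y, z).

    `edges` is an iterable of (source, target) pairs.
    Returns the set of all derivable path(x, y) facts.
    """
    edges = set(edges)

    # Index edge facts by source so the join on y is a dictionary lookup.
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, set()).add(dst)

    path = set(edges)   # every fact derived so far
    delta = set(edges)  # facts that were new in the most recent round

    while delta:
        # Apply the recursive rule to the delta only, not to all of `path`.
        derived = {
            (x, z)
            for (x, y) in delta
            for z in successors.get(y, ())
        }
        # Keep only genuinely new facts; an empty delta means a fixpoint.
        delta = derived - path
        path |= delta

    return path


if __name__ == "__main__":
    for fact in sorted(transitive_closure_seminaive([(1, 2), (2, 3), (3, 4)])):
        print(fact)
```

Restricting each round to the delta avoids rederiving known facts. This loosely resembles a superstep in a Pregel-like system, where only vertices that received new information do work, although that analogy is an interpretation rather than a claim from the abstract.
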
Year | DOI | Venue |
---|---|---|
2016 | 10.1109/BigData.2016.7840589 | 2016 IEEE International Conference on Big Data (Big Data) |
Keywords | Field | DocType |
datalography, Datalog graph analytics, Datalog evaluation engine, aggregate function evaluation, high-performance distributed computing framework, BSP-style graph processing engines, Datalog optimizations, seminaive evaluation, Datalog queries, graph processing engines, super vertices, eager aggregation, asynchronous execution, Pregel-like systems, Apache Giraph | Data mining, Computer science, Theoretical computer science, Dataflow, Artificial intelligence, Distributed database, Analytics, Graph theory, Asynchronous communication, Bespoke, Graph database, Datalog, Machine learning | Conference |
ISBN | Citations | PageRank |
978-1-4673-9006-4 | 3 | 0.39 |
References | Authors | |
15 | 4 | |
Name | Order | Citations | PageRank |
---|---|---|---|
Walaa Eldin Moustafa | 1 | 35 | 2.98 |
Vicky Papavasileiou | 2 | 3 | 0.39 |
Ken Yocum | 3 | 644 | 67.41 |
Alin Deutsch | 4 | 2267 | 247.45 |