Title
From Theory to Practice: Efficient Join Query Evaluation in a Parallel Database System
Abstract
Big data analytics often requires processing complex queries using massive parallelism, where the main performance metric is the communication cost incurred during data reshuffling. In this paper, we describe a system that can efficiently compute complex join queries, including queries with cyclic joins, on a massively parallel architecture. We build on two independent lines of work for multi-join query evaluation: a communication-optimal algorithm for distributed evaluation, and a worst-case optimal algorithm for sequential evaluation. We evaluate these algorithms together, then describe novel, practical optimizations for both algorithms.
Year
2015
DOI
10.1145/2723372.2750545
Venue
ACM SIGMOD Conference
Field
Hash join, Query optimization, Joins, Massively parallel architecture, Computer science, Parallel database, Massively parallel, Sort-merge join, Theoretical computer science, Big data, Database
DocType
Conference
Citations
35
PageRank
0.92
References
29
Authors
3
Name
Order
Citations
PageRank
Shumo Chu129711.08
Magdalena Balazinska24513301.06
Dan Suciu396251349.54