Title
Robust and Efficient Large-Large Table Outer Joins on Distributed Infrastructures.
Abstract
Outer joins are ubiquitous in many workloads but are sensitive to load-balancing problems. Current approaches mitigate such skew-induced problems by using (partial) replication. However, contemporary replication-based approaches (1) introduce overhead, since they usually result in redundant data movement, (2) are sensitive to parameter tuning and to the degree of data skew, and (3) typically require that one side of the join be small. In this paper, we propose a novel parallel algorithm, Redistribution and Efficient Query with Counters (REQC), aimed at robustness with respect to the size of the join sides, variation in skew, and parameter tuning. Experimental results demonstrate that our algorithm is faster, more robust, and less demanding in terms of network bandwidth than the state of the art.
Year: 2014
DOI: 10.1007/978-3-319-09873-9_22
Venue: Lecture Notes in Computer Science
Field: Joins, Parallel algorithm, Computer science, Parallel computing, Robustness (computer science), Bandwidth (signal processing), Skew, Distributed computing
DocType: Conference
Volume: 8632
ISSN: 0302-9743
Citations: 11
PageRank: 0.54
References: 20
Authors: 4
Name                      Order  Citations  PageRank
Long Cheng                1      91         16.99
Spyros Kotoulas           2      590        46.46
Tomas E. Ward             3      104        19.10
Georgios Theodoropoulos   4      332        31.39