Title
A Hybrid Join Algorithm on Top of Map Reduce
Abstract
Hadoop has shown great power in processing vast amounts of data in parallel. Hive, the database on Hadoop, enables more experts to process relational data by providing an SQL-like interface. However, Hive does not provide an efficient approach for join, a common but expensive operator in relational databases. Given the importance of join, this paper proposes a novel hybrid algorithm, HJA, which automatically chooses the better-performing method among several candidates: divide and memory copy merge, Partition Join (PJ), and naïve Hive join. Experiments show that HJA achieves the best performance in most situations.
Year: 2011
DOI: 10.1109/SKG.2011.13
Venue: SKG
Keywords: efficient approach, partition join, relational data, hybrid join algorithm, memory copy, relational database, map reduce, novel hybrid algorithm, great power, expensive operator, best performance, vast data, sql, hybrid algorithm, semantics, relational databases, parallel processing
Field: SQL, Hash join, Data mining, Hybrid algorithm, Recursive join, Relational database, Computer science, Nested set model, Sort-merge join, Theoretical computer science, Block nested loop, Database
DocType: Conference
Citations: 1
PageRank: 0.35
References: 3
Authors: 7
Name             Order   Citations   PageRank
Weisong Hu       1       62          5.76
Lili Ma          2       32          5.37
Xiaowei Liu      3       23          5.75
Hongwei Qi       4       20          4.00
Li Zha           5       1           0.35
Huaming Liao     6       57          5.09
Yuezhuo Zhang    7       12          1.59