Abstract | ||
---|---|---|
In this study, we find that data skewness adversely affects MapReduce-based parallel kNN-join operations running on clusters. We propose a data partitioning approach, called kNN-DP, to alleviate the load imbalance incurred by data skewness. The overarching goal of kNN-DP is to equally divide data objects into a large number of partitions, which are processed by mappers and reducers... |
Year | DOI | Venue |
---|---|---|
2018 | 10.1109/TPDS.2017.2767596 | IEEE Transactions on Parallel and Distributed Systems |
Keywords | Field | DocType |
Time complexity, Distributed databases, Silicon, Algorithm design and analysis, Optimization, Partitioning algorithms | Data mining, Joins, Skewness, Algorithm design, Computer science, Upper and lower bounds, Parallel computing, Distributed database, Time complexity, Scalability, Speedup | Journal |
Volume | Issue | ISSN |
29 | 3 | 1045-9219 |
Citations | PageRank | References |
2 | 0.37 | 0 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Xujun Zhao | 1 | 30 | 4.31 |
Jifu Zhang | 2 | 95 | 19.42 |
Xiao Qin | 3 | 1836 | 125.69 |