Title
kNN-DP: Handling Data Skewness in kNN Joins Using MapReduce
Abstract
In this study, we find that data skewness adversely affects MapReduce-based parallel kNN-join operations running on clusters. We propose a data partitioning approach, called kNN-DP, to alleviate the load imbalance incurred by data skewness. The overarching goal of kNN-DP is to divide data objects equally into a large number of partitions, which are processed by mappers and reducers...
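To illustrate why equal division matters under skew, here is a minimal sketch. It is not the kNN-DP algorithm, which the truncated abstract does not describe. It assumes one-dimensional, exponentially distributed values and compares two generic schemes: equal-width ranges and equal-count (rank-based) partitions. All function names are hypothetical.

```python
import random
from collections import Counter


def equal_width_partition(values, num_partitions):
    """Assign each value to one of num_partitions equally wide ranges."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_partitions or 1.0  # guard against all-equal input
    # Clamp so that the maximum value falls into the last partition.
    return [min(int((v - lo) / width), num_partitions - 1) for v in values]


def equal_count_partition(values, num_partitions):
    """Assign values by rank so every partition holds about the same number."""
    order = sorted(range(len(values)), key=values.__getitem__)
    assignment = [0] * len(values)
    for rank, idx in enumerate(order):
        assignment[idx] = rank * num_partitions // len(values)
    return assignment


def partition_sizes(assignment, num_partitions):
    """Count how many objects landed in each partition."""
    counts = Counter(assignment)
    return [counts.get(p, 0) for p in range(num_partitions)]


# Assumed toy workload: 10,000 skewed (exponentially distributed) values.
random.seed(0)
skewed = [random.expovariate(1.0) for _ in range(10_000)]

width_sizes = partition_sizes(equal_width_partition(skewed, 8), 8)
count_sizes = partition_sizes(equal_count_partition(skewed, 8), 8)

print("equal-width sizes:", width_sizes)
print("equal-count sizes:", count_sizes)
```

In this toy setting, equal-width ranges concentrate most objects in the first few partitions, so the workers handling them would carry most of the load, whereas rank-based assignment gives every partition the same number of objects.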
Year: 2018
DOI: 10.1109/TPDS.2017.2767596
Venue: IEEE Transactions on Parallel and Distributed Systems
Keywords: Time complexity, Distributed databases, Silicon, Algorithm design and analysis, Optimization, Partitioning algorithms
Field: Data mining, Joins, Skewness, Algorithm design, Computer science, Upper and lower bounds, Parallel computing, Distributed database, Time complexity, Scalability, Speedup
DocType: Journal
Volume: 29
Issue: 3
ISSN: 1045-9219
Citations: 2
PageRank: 0.37
References: 0
Authors: 3
Name | Order | Citations | PageRank
Xujun Zhao | 1 | 30 | 4.31
Jifu Zhang | 2 | 95 | 19.42
Xiao Qin | 3 | 1836 | 125.69