Title
Processing of probabilistic skyline queries using MapReduce
Abstract
The number of applications that naturally generate large volumes of uncertain data has grown rapidly. With the advent of such applications, support for advanced analysis queries such as the skyline and its variant operators over big uncertain data has become important. In this paper, we propose effective parallel algorithms using MapReduce to process probabilistic skyline queries for uncertain data modeled by both discrete and continuous models. We present three filtering methods to identify probabilistic non-skyline objects in advance. We next develop a single-phase MapReduce algorithm, PS-QP-MR, which uses space partitioning based on a variant of quadtrees to distribute the instances of objects effectively, and an enhanced algorithm, PS-QPF-MR, which additionally applies the three filtering methods. We also propose a workload balancing technique that balances the workload of the reduce functions based on the number of machines available. Finally, we present the brute-force algorithms PS-BR-MR and PS-BRF-MR, which use random partitioning and the filtering methods. In our experiments, we demonstrate the efficiency and scalability of PS-QPF-MR compared to the other algorithms.
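To make the query in the abstract concrete, the following is a minimal single-machine sketch of a probabilistic skyline under the discrete model. It is not the paper's PS-QP-MR or PS-QPF-MR algorithm and involves no MapReduce, partitioning, or filtering. It assumes the commonly used definition in which an object's skyline probability is the sum, over its instances, of the instance probability times the probability that no other object takes an instance dominating it. It also assumes that smaller values are better in every dimension. The function names, the threshold parameter, and the sample data are all hypothetical.

```python
from math import prod

def dominates(a, b):
    """True if point a dominates b: no worse in every dimension and
    strictly better in at least one (assumption: smaller is better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline_probability(objects, name):
    """Skyline probability of one uncertain object (discrete model).

    objects maps an object name to a list of (point, probability) instances.
    """
    total = 0.0
    for point, p in objects[name]:
        # Probability that no other object materializes as a dominating instance.
        not_dominated = prod(
            1.0 - sum(q for other_point, q in instances if dominates(other_point, point))
            for other, instances in objects.items()
            if other != name
        )
        total += p * not_dominated
    return total

def probabilistic_skyline(objects, threshold):
    """Names of objects whose skyline probability is at least the threshold."""
    return sorted(n for n in objects if skyline_probability(objects, n) >= threshold)

# Hypothetical data: three objects with two-dimensional instances.
objects = {
    "A": [((1, 4), 0.5), ((3, 3), 0.5)],
    "B": [((2, 2), 1.0)],
    "C": [((4, 5), 0.5), ((1, 1), 0.5)],
}

print(probabilistic_skyline(objects, 0.5))  # ['B', 'C']
```

This brute-force version compares every instance against every instance of every other object, which is the quadratic cost that partitioning and early filtering of non-skyline objects are meant to reduce.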
Year: 2015
DOI: 10.14778/2824032.2824040
Venue: Proceedings of the VLDB Endowment
Field: Space partitioning, Skyline, Data mining, Computer science, Workload, Parallel algorithm, Filter (signal processing), Uncertain data, Probabilistic logic, Database, Scalability
DocType: Journal
Volume: 8
Issue: 12
ISSN: 2150-8097
Citations: 7
PageRank: 0.44
References: 24
Authors: 3
Name | Order | Citations | PageRank
Yoonjae Park | 1 | 77 | 3.33
Jun-Ki Min | 2 | 688 | 46.57
Kyuseok Shim | 3 | 5120 | 752.19