Title
Constrained Skyline Query Processing against Distributed Data Sites
Abstract
The skyline of a multidimensional point set is a subset of interesting points that are not dominated by others. In this paper, we investigate constrained skyline queries in a large-scale unstructured distributed environment, where relevant data are distributed among geographically scattered sites. We first propose a partition algorithm that divides all data sites into incomparable groups such that the skyline computations in all groups can be parallelized without changing the final result. We then develop a novel algorithm framework called PaDSkyline for parallel skyline query processing among partitioned site groups. We also employ intragroup optimization and a multifiltering technique to improve the skyline query processes within each group. In particular, multiple (local) skyline points are sent together with the query as filtering points, which help identify unqualified local skyline points early on a data site. In this way, the amount of data to be transmitted via network connections is reduced, and thus, the overall query response time is shortened further. Cost models and heuristics are proposed to guide the selection of a given number of filtering points from a superset. A cost-efficient model is developed to determine how many filtering points to use for a particular data site. The results of an extensive experimental study demonstrate that our proposals are effective and efficient.
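The filtering idea in the abstract can be illustrated with a minimal sketch. It is not the PaDSkyline implementation: it assumes smaller values are better in every dimension, models the constraint as an axis-aligned box, and picks the filtering point by hand rather than with the paper's cost models and heuristics. All function names and sample data are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """p dominates q: no worse everywhere, strictly better somewhere.

    Assumption: smaller is better in every dimension.
    """
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def in_region(p: Point, lo: Point, hi: Point) -> bool:
    """Constraint check, assumed here to be an axis-aligned box [lo, hi]."""
    return all(l <= x <= h for x, l, h in zip(p, lo, hi))


def constrained_skyline(points: Sequence[Point], lo: Point, hi: Point) -> List[Point]:
    """Skyline of the points that satisfy the constraint (naive O(n^2) scan)."""
    candidates = [p for p in points if in_region(p, lo, hi)]
    return [p for p in candidates if not any(dominates(q, p) for q in candidates)]


def filter_local_skyline(local_skyline: Sequence[Point],
                         filtering_points: Sequence[Point]) -> List[Point]:
    """Drop local skyline points dominated by any filtering point.

    Dropped points can never appear in the global skyline, so the site
    does not need to transmit them.
    """
    return [p for p in local_skyline
            if not any(dominates(f, p) for f in filtering_points)]


# Hypothetical data held by two sites.
site_a = [(1.0, 9.0), (3.0, 3.0), (8.0, 2.0), (9.0, 9.0)]
site_b = [(2.0, 8.0), (4.0, 4.0), (6.0, 1.0), (0.5, 20.0)]
lo, hi = (0.0, 0.0), (10.0, 10.0)

sky_a = constrained_skyline(site_a, lo, hi)
filters = [(3.0, 3.0)]  # hand-picked from sky_a; the paper selects these via cost models
sky_b = constrained_skyline(site_b, lo, hi)
survivors_b = filter_local_skyline(sky_b, filters)  # what site B would transmit

# Merge the partial results into the final constrained skyline.
final = constrained_skyline(sky_a + survivors_b, lo, hi)
```

In this toy run, site B's local skyline holds three points, but only two survive the filter, and the merged result equals the skyline computed over all data at once. That mirrors the abstract's claim that filtering reduces transmitted data without changing the final answer.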
Year: 2011
DOI: 10.1109/TKDE.2010.103
Venue: IEEE Trans. Knowl. Data Eng.
Keywords: unqualified local skyline point, optimisation, particular data site, cost-efficient model, network connections, skyline computations, large-scale unstructured distributed environment, skyline query process, skyline point, multifiltering technique, constrained skyline query processing, skyline query, relevant data, novel algorithm framework, padskyline, information filtering, partition algorithm, multidimensional point set, skyline query processes, parallel skyline query processing, unqualified local skyline points, geographically scattered sites, overall query response time, skyline computation, constrained skyline query, intragroup optimization, filtering point, data site, distributed query processing, filtering points, distributed data sites, query response time, distributed databases, query processing, data sites, parallel processing, silicon, mobile computing
Field: Skyline, Partition problem, Data mining, Subset and superset, Data processing, Distributed Computing Environment, Computer science, Filter (signal processing), Theoretical computer science, Heuristics, Distributed database
DocType: Journal
Volume: 23
Issue: 2
ISSN: 1041-4347
Citations: 23
PageRank: 0.74
References: 16
Authors: 3
Name          Order  Citations  PageRank
Lijiang Chen  1      304        23.22
Bin Cui       2      1843       124.59
Hua Lu        3      1380       83.74