Title
Optimizing Quality for Probabilistic Skyline Computation and Probabilistic Similarity Search (Extended Abstract)
Abstract
Probabilistic queries often return noisy result sets because of data uncertainty. In this paper, we propose an efficient optimization framework, termed QueryClean, for both probabilistic skyline computation and probabilistic similarity search. Its goal is to optimize query quality by selecting a group of uncertain objects to clean under a limited resource budget, where quality is measured by an entropy-based function. We develop an efficient index that organizes the possible result sets of probabilistic queries, which avoids repeated probabilistic query evaluations over a large number of possible worlds during quality computation. Moreover, using two newly presented heuristics, we present exact and approximate algorithms for the optimization problem. Extensive experiments on both real and synthetic data sets demonstrate the efficiency and scalability of QueryClean.
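The abstract does not give formal definitions, so the following is only a minimal illustrative sketch of the general idea, not the QueryClean algorithm itself. It assumes each uncertain object has a few discrete instances with probabilities, that smaller values are better in every dimension, that quality is the negative Shannon entropy of the distribution over possible skyline result sets, and that cleaning an object resolves it to a single instance. All function names, the cost model, and the toy data are hypothetical. The sketch enumerates every possible world by brute force, which is exactly the cost the paper's index is said to avoid.

```python
import itertools
import math


def dominates(a, b):
    """a dominates b: no worse in every dimension, strictly better in one (smaller is better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(world):
    """Names of objects in this world that no other object dominates."""
    return frozenset(
        name for name, pt in world.items()
        if not any(dominates(other, pt) for o, other in world.items() if o != name)
    )


def result_distribution(objects):
    """Enumerate all possible worlds and sum the probability of each distinct skyline."""
    names = list(objects)
    dist = {}
    for combo in itertools.product(*(objects[n] for n in names)):
        prob = math.prod(p for _, p in combo)
        world = {n: pt for n, (pt, _) in zip(names, combo)}
        key = skyline(world)
        dist[key] = dist.get(key, 0.0) + prob
    return dist


def quality(objects):
    """Assumed quality: negative entropy in bits; 0 means the result set is certain."""
    return sum(p * math.log2(p) for p in result_distribution(objects).values() if p > 0)


def expected_quality_after_cleaning(objects, cleaned):
    """Expected quality when every object in `cleaned` is resolved to one instance."""
    cleaned = list(cleaned)
    total = 0.0
    for combo in itertools.product(*(objects[n] for n in cleaned)):
        prob = math.prod(p for _, p in combo)
        fixed = dict(objects)
        for n, (pt, _) in zip(cleaned, combo):
            fixed[n] = [(pt, 1.0)]  # the cleaned object becomes certain
        total += prob * quality(fixed)
    return total


def best_cleaning_set(objects, costs, budget):
    """Brute-force search over all subsets whose total cost fits the budget."""
    best, best_q = (), quality(objects)
    for r in range(1, len(objects) + 1):
        for subset in itertools.combinations(objects, r):
            if sum(costs[n] for n in subset) > budget:
                continue
            q = expected_quality_after_cleaning(objects, subset)
            if q > best_q:
                best, best_q = subset, q
    return best, best_q


# Hypothetical toy data: object name -> list of (point, probability) instances.
objects = {
    "a": [((1, 4), 0.5), ((3, 3), 0.5)],
    "b": [((2, 2), 1.0)],
    "c": [((4, 1), 0.5), ((2, 3), 0.5)],
}
costs = {"a": 1, "b": 1, "c": 1}

if __name__ == "__main__":
    print(quality(objects))
    print(best_cleaning_set(objects, costs, budget=1))
    print(best_cleaning_set(objects, costs, budget=2))
```

In this toy instance the four possible worlds each yield a different skyline, so uncertainty is maximal before cleaning; cleaning the already certain object "b" gains nothing, while cleaning both "a" and "c" removes all uncertainty. The exhaustive subset search stands in for the exact algorithm only conceptually; a practical method would need pruning or approximation, as the abstract indicates.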
Year
2019
DOI
10.1109/ICDE.2019.00259
Venue
2019 IEEE 35th International Conference on Data Engineering (ICDE)
Keywords
Probabilistic logic, Cleaning, Optimization, Indexes, Heuristic algorithms, Data integrity, Uncertainty
DocType
Conference
ISSN
1084-4627
ISBN
978-1-5386-7474-1
Citations
0
PageRank
0.34
References
0
Authors
5
Name
Order
Citations
PageRank
Xiaoye Miao1537.53
Yunjun Gao286289.71
Linlin Zhou3282.44
Wei Wang4147958.62
Qing Li53222433.87