Title
Improved Correlated Sampling for Join Size Estimation
Abstract
Recent research on sampling-based join size estimation has focused on a promising new technique known as correlated sampling. While several variants of this technique have been proposed, there is a lack of a systematic study of this family of techniques. In this paper, we first introduce a framework to characterize its design space in terms of five parameters. Based on this framework, we propose a new correlated sampling based technique to address the limitations of existing techniques. Our new technique is based on using a discrete learning method for estimating the join size from samples. We experimentally compare the performance of multiple variants of our new technique and identify a hybrid variant that provides the best estimation quality. This hybrid variant not only outperforms the state-of-the-art correlated sampling technique, but it is also more robust to small samples and skewed data.
Year
2020
DOI
10.1109/ICDE48307.2020.00035
Venue
2020 IEEE 36th International Conference on Data Engineering (ICDE)
Keywords
query processing, database systems, sampling methods
DocType
Conference
ISSN
1063-6382
ISBN
978-1-7281-2904-4
Citations
0
PageRank
0.34
References
21
Authors
2
Name
Order
Citations
PageRank
TaiNing Wang100.68
Chee Yong Chan2643199.24