Title
A comparison of five probabilistic view-size estimation techniques in OLAP
Abstract
A data warehouse cannot materialize all possible views, hence we must estimate quickly, accurately, and reliably the size of views to determine the best candidates for materialization. Many available techniques for view-size estimation make particular statistical assumptions and their error can be large. Comparatively, unassuming probabilistic techniques are slower, but they estimate very large view sizes accurately and reliably using little memory. We compare five unassuming hashing-based view-size estimation techniques including Stochastic Probabilistic Counting and LogLog Probabilistic Counting. Our experiments show that only Generalized Counting, Gibbons-Tirthapura, and Adaptive Counting provide universally tight estimates irrespective of the size of the view; of those, only Adaptive Counting remains constantly fast as we increase the memory budget.
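To make the idea of hashing-based view-size estimation concrete, the following is a minimal Python sketch in the spirit of the Gibbons-Tirthapura technique named in the abstract. It is not the authors' implementation: the helper `hash64`, the function `estimate_view_size`, the parameter `capacity`, and the choice of BLAKE2 as the hash function are all assumptions made for illustration. The view size is taken to be the number of distinct value combinations over the chosen dimension columns.

```python
import hashlib


def hash64(item):
    """Map a value to a 64-bit integer (hypothetical helper, not from the paper)."""
    digest = hashlib.blake2b(repr(item).encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def estimate_view_size(rows, dims, capacity=256):
    """Estimate the number of distinct tuples obtained by projecting rows on dims.

    Keeps at most `capacity` hash values in memory (the memory budget).
    """
    level = 0      # only hashes divisible by 2**level are retained
    kept = set()
    for row in rows:
        h = hash64(tuple(row[d] for d in dims))
        if h % (1 << level) == 0:
            kept.add(h)
            # Over budget: halve the sampling rate and prune the sample.
            while len(kept) > capacity:
                level += 1
                kept = {x for x in kept if x % (1 << level) == 0}
    # Each retained hash stands for roughly 2**level distinct tuples.
    return len(kept) * (1 << level)


if __name__ == "__main__":
    facts = [(i % 10, i % 7, i) for i in range(1000)]
    print(estimate_view_size(facts, dims=(0, 1)))
```

When the true view size fits within `capacity`, no pruning occurs and the sketch returns the exact distinct count (barring 64-bit hash collisions); for larger views, increasing `capacity` should tighten the estimate at the cost of more memory.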
Year
2007
DOI
10.1145/1317331.1317335
Venue
data warehousing and olap
Keywords
data warehouse, materialized views
DocType
Conference
Volume
abs/cs/0703058
Citations
10
PageRank
0.69
References
6
Authors
2
Name
Order
Citations
PageRank
Kamel Aouiche123313.32
Daniel Lemire282152.14