Title
Efficient construction of histograms for multidimensional data using quad-trees
Abstract
Histograms are useful for estimating the selectivity of queries in areas such as database query optimization and data exploration. In this paper, we propose a new histogram method for multidimensional data, called the Q-Histogram, based on the quad-tree, a popular index structure for multidimensional data sets. The compact representation of the target data obtainable from the quad-tree allows a histogram to be constructed quickly with only one scan of the underlying data, which is the minimum possible. In addition to this advantage in computation time, the proposed method also provides better selectivity estimation quality than other existing methods. We present a new measure of data skew for a histogram bucket, called the weighted bucket skew. Then, we provide an effective technique for skew-tolerant organization of histograms. Finally, we compare the accuracy and efficiency of the proposed method with those of other existing methods using both real-life and synthetic data sets. The experimental results show that the proposed method generally outperforms other existing methods in both accuracy and computational efficiency.
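The abstract does not spell out the Q-Histogram algorithm or the definition of the weighted bucket skew, so the following is only a generic sketch of the two ideas it relies on: counting points per quad-tree cell in a single scan, and estimating query selectivity from bucket counts. It assumes two-dimensional points normalized to [0, 1) and a uniform spread of points inside each cell. The function names `leaf_counts`, `parent_counts`, and `estimate_selectivity` are hypothetical and do not come from the paper.

```python
from collections import Counter


def leaf_counts(points, depth):
    """Count points per leaf cell of a fixed-depth quad-tree in one scan.

    Assumption: points are (x, y) pairs with coordinates in [0, 1).
    """
    n = 1 << depth  # number of cells per axis at the leaf level
    counts = Counter()
    for x, y in points:
        # Clamp so that a coordinate equal to 1.0 still falls in the last cell.
        counts[(min(int(x * n), n - 1), min(int(y * n), n - 1))] += 1
    return counts


def parent_counts(counts):
    """Aggregate cell counts one level up the quad-tree without rescanning data."""
    parents = Counter()
    for (i, j), c in counts.items():
        parents[(i >> 1, j >> 1)] += c  # four sibling cells share one parent
    return parents


def estimate_selectivity(counts, depth, query, total):
    """Estimate the fraction of points inside query = (x0, y0, x1, y1).

    Assumption: points are spread uniformly inside each cell, so a cell
    contributes its count scaled by the overlapped share of its area.
    """
    n = 1 << depth
    side = 1.0 / n
    x0, y0, x1, y1 = query
    estimate = 0.0
    for (i, j), c in counts.items():
        w = min(x1, (i + 1) * side) - max(x0, i * side)
        h = min(y1, (j + 1) * side) - max(y0, j * side)
        if w > 0 and h > 0:
            estimate += c * (w * h) / (side * side)
    return estimate / total


points = [(0.1, 0.1), (0.2, 0.2), (0.6, 0.6), (0.9, 0.1)]
leaves = leaf_counts(points, depth=2)
coarser = parent_counts(leaves)
selectivity = estimate_selectivity(leaves, 2, (0.0, 0.0, 0.5, 0.5), len(points))
```

Because the coarser levels are derived from the leaf counts alone, a bucket-merging strategy could work on this summary instead of returning to the raw data, which is the general reason a single scan can suffice.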
Year: 2011
DOI: 10.1016/j.dss.2011.05.006
Venue: Decision Support Systems
Keywords: data skew, real-life data set, multidimensional data, existing method, target data, multidimensional data set, synthetic data set, data exploration, efficient construction, better performance, data management, synthetic data, query optimization
Field: Query optimization, Data mining, Histogram, Data set, Database query, Computer science, Skew, Data management, Computation, Quadtree
DocType: Journal
Volume: 52
Issue: 1
ISSN: Decision Support Systems
Citations: 0
PageRank: 0.34
References: 31
Authors (4)
Name | Order | Citations | PageRank
Yohan J. Roh | 1 | 21 | 2.87
Jae Ho Kim | 2 | 197 | 22.06
Jin Hyun Son | 3 | 217 | 18.21
Myoung Ho Kim | 4 | 1040 | 273.40