Title
Efficient Online Aggregates in Dense-Region-Based Data Cube Representations
Abstract
In-memory OLAP systems require a space-efficient representation of sparse data cubes in order to accommodate large data sets. On the other hand, most efficient online aggregation techniques, such as prefix sums, are built on dense array-based representations. These are often not applicable to real-world data due to the size of the arrays which usually cannot be compressed well, as most sparsity is removed during pre-processing. A possible solution is to identify dense regions in a sparse cube and only represent those using arrays, while storing sparse data separately, e.g. in a spatial index structure. Previous dense-region-based approaches have concentrated mainly on the effectiveness of the dense-region detection (i.e. on the space-efficiency of the result). However, especially in higher-dimensional cubes, data is usually more cluttered, resulting in a potentially large number of small dense regions, which negatively affects query performance on such a structure. In this paper, our focus is not only on space-efficiency but also on time-efficiency, both for the initial dense-region extraction and for queries carried out in the resulting hybrid data structure. We describe two methods to trade available memory for increased aggregate query performance. In addition, optimizations in our approach significantly reduce the time to build the initial data structure compared to former systems. Also, we present a straightforward adaptation of our approach to support multi-core or multi-processor architectures, which can further enhance query performance. Experiments with different real-world data sets show how various parameter settings can be used to adjust the efficiency and effectiveness of our algorithms.
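The hybrid idea in the abstract can be illustrated with a minimal two-dimensional sketch. It is not the authors' implementation. It assumes that each dense region is a rectangular array with a precomputed prefix-sum table, and that the remaining sparse cells sit in a plain dictionary standing in for the spatial index structure. All names (`DenseRegion`, `HybridCube`, `range_sum`) are hypothetical.

```python
def build_prefix_sums(cells):
    """Return P where P[i][j] is the sum of cells[0..i-1][0..j-1].

    P has one extra zero row and column so range queries need no edge cases.
    """
    rows, cols = len(cells), len(cells[0])
    P = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        row_sum = 0
        for j in range(cols):
            row_sum += cells[i][j]
            P[i + 1][j + 1] = P[i][j + 1] + row_sum
    return P


class DenseRegion:
    """A rectangular dense block of the cube, stored as a prefix-sum array."""

    def __init__(self, origin, cells):
        self.r, self.c = origin              # global coordinates of cell [0][0]
        self.rows, self.cols = len(cells), len(cells[0])
        self.P = build_prefix_sums(cells)

    def range_sum(self, r0, c0, r1, c1):
        """Sum of the part of this region inside the inclusive global query box."""
        lo_r, hi_r = max(r0, self.r), min(r1, self.r + self.rows - 1)
        lo_c, hi_c = max(c0, self.c), min(c1, self.c + self.cols - 1)
        if lo_r > hi_r or lo_c > hi_c:
            return 0                          # query does not touch this region
        # Translate to local coordinates, then use inclusion-exclusion on P.
        a, b = lo_r - self.r, lo_c - self.c
        x, y = hi_r - self.r + 1, hi_c - self.c + 1
        return self.P[x][y] - self.P[a][y] - self.P[x][b] + self.P[a][b]


class HybridCube:
    """Dense regions plus leftover sparse cells (assumption: a dict, not a real index)."""

    def __init__(self, regions, sparse_cells):
        self.regions = regions
        self.sparse = sparse_cells            # {(row, col): value}

    def range_sum(self, r0, c0, r1, c1):
        total = sum(reg.range_sum(r0, c0, r1, c1) for reg in self.regions)
        # Linear scan as a placeholder for a spatial index lookup.
        total += sum(v for (r, c), v in self.sparse.items()
                     if r0 <= r <= r1 and c0 <= c <= c1)
        return total


cube = HybridCube(
    regions=[DenseRegion(origin=(10, 10), cells=[[1, 2], [3, 4]])],
    sparse_cells={(0, 0): 5, (50, 50): 7},
)
print(cube.range_sum(0, 0, 11, 10))
print(cube.range_sum(0, 0, 100, 100))
```

In this sketch each dense region answers its share of a range query with four table lookups, while every region overlapped by the query adds one more lookup step. This is why a large number of small dense regions can slow queries down, as the abstract notes.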
Year: 2010
DOI: 10.1007/978-3-642-03730-6_15
Venue: Transactions on Large-Scale Data- and Knowledge-Centered Systems II
Keywords: query performance, different real-world data set, hybrid data structure, initial data structure, large data set, sparse data, sparse data cube, spatial index structure, dense array-based representation, dense region, Dense-Region-Based Data Cube Representations, Efficient Online Aggregates
Field: Data structure, Data mining, Data set, Computer science, Range query (data structures), Online aggregation, Online analytical processing, Database, Data cube, Sparse matrix, Spatial database
DocType: Journal
Volume: 5691
ISSN: 0302-9743
ISBN: 3-642-16174-X
Citations: 1
PageRank: 0.38
References: 15
Authors: 2
Name | Order | Citations | PageRank
Kais Haddadin | 1 | 1 | 0.38
Tobias Lauer | 2 | 37 | 8.65