Efficient Online Aggregates in Dense-Region-Based Data Cube Representations - Citegraph

Paper Info

Title
Efficient Online Aggregates in Dense-Region-Based Data Cube Representations

Abstract
In-memory OLAP systems require a space-efficient representation of sparse data cubes in order to accommodate large data sets. On the other hand, most efficient online aggregation techniques, such as prefix sums, are built on dense array-based representations. These are often not applicable to real-world data due to the size of the arrays which usually cannot be compressed well, as most sparsity is removed during pre-processing. A possible solution is to identify dense regions in a sparse cube and only represent those using arrays, while storing sparse data separately, e.g. in a spatial index structure. Previous dense-region-based approaches have concentrated mainly on the effectiveness of the dense-region detection (i.e. on the space-efficiency of the result). However, especially in higher-dimensional cubes, data is usually more cluttered, resulting in a potentially large number of small dense regions, which negatively affects query performance on such a structure. In this paper, our focus is not only on space-efficiency but also on time-efficiency, both for the initial dense-region extraction and for queries carried out in the resulting hybrid data structure. We describe two methods to trade available memory for increased aggregate query performance. In addition, optimizations in our approach significantly reduce the time to build the initial data structure compared to former systems. Also, we present a straightforward adaptation of our approach to support multi-core or multi-processor architectures, which can further enhance query performance. Experiments with different real-world data sets show how various parameter settings can be used to adjust the efficiency and effectiveness of our algorithms.
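The hybrid layout the abstract describes can be illustrated with a minimal sketch. It is not the authors' implementation: the names `DenseRegion` and `hybrid_range_sum` are hypothetical, the cube is restricted to two dimensions, and a plain dictionary scan stands in for the spatial index structure that would hold the sparse cells. Each dense region keeps a prefix-sum table so that its contribution to a range-sum query takes constant time.

```python
class DenseRegion:
    """Axis-aligned 2D block of the cube, stored as a prefix-sum table (illustrative only)."""

    def __init__(self, x0, y0, values):
        # (x0, y0) is the cube coordinate of values[0][0]; x indexes rows, y columns.
        self.x0, self.y0 = x0, y0
        self.h, self.w = len(values), len(values[0])
        # prefix[i][j] holds the sum of values[0..i-1][0..j-1].
        self.prefix = [[0] * (self.w + 1) for _ in range(self.h + 1)]
        for i in range(self.h):
            for j in range(self.w):
                self.prefix[i + 1][j + 1] = (
                    values[i][j]
                    + self.prefix[i][j + 1]
                    + self.prefix[i + 1][j]
                    - self.prefix[i][j]
                )

    def range_sum(self, x_lo, y_lo, x_hi, y_hi):
        """Sum of the cells of this region inside the inclusive query box."""
        # Clip the query box to the region and convert to local indices.
        a = max(x_lo - self.x0, 0)
        b = max(y_lo - self.y0, 0)
        c = min(x_hi - self.x0, self.h - 1)
        d = min(y_hi - self.y0, self.w - 1)
        if a > c or b > d:
            return 0  # query box does not overlap this region
        p = self.prefix
        return p[c + 1][d + 1] - p[a][d + 1] - p[c + 1][b] + p[a][b]


def hybrid_range_sum(regions, sparse_cells, x_lo, y_lo, x_hi, y_hi):
    """Combine prefix-sum lookups on dense regions with a scan of the sparse cells.

    sparse_cells maps (x, y) -> value; a real system would query a spatial
    index here instead of scanning every entry (assumption for brevity).
    """
    total = sum(r.range_sum(x_lo, y_lo, x_hi, y_hi) for r in regions)
    total += sum(
        v
        for (x, y), v in sparse_cells.items()
        if x_lo <= x <= x_hi and y_lo <= y <= y_hi
    )
    return total


regions = [DenseRegion(10, 20, [[1, 2, 3], [4, 5, 6]])]
sparse_cells = {(0, 0): 7, (11, 30): 9, (100, 100): 1}
print(hybrid_range_sum(regions, sparse_cells, 0, 0, 11, 22))
print(hybrid_range_sum(regions, sparse_cells, 11, 21, 11, 30))
```

The sketch also shows why many small dense regions hurt query time, as the abstract notes: every region overlapping the query box adds one lookup, so the per-query cost grows with the number of regions.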

Year: 2010
DOI: 10.1007/978-3-642-03730-6_15
Venue: Transactions on large-scale data- and knowledge-centered systems II
Keywords: query performance, different real-world data set, hybrid data structure, initial data structure, large data set, sparse data, sparse data cube, spatial index structure, dense array-based representation, dense region, Dense-Region-Based Data Cube Representations, Efficient Online Aggregates
Field: Data structure, Data mining, Data set, Computer science, Range query (data structures), Online aggregation, Online analytical processing, Database, Data cube, Sparse matrix, Spatial database
DocType: Journal
Volume: 5691
ISSN: 0302-9743
ISBN: 3-642-16174-X
Citations: 1
PageRank: 0.38
References: 15
Authors
Name	Order	Citations	PageRank
Kais Haddadin	1	1	0.38
Tobias Lauer	2	37	8.65
