Title
HaCube: Extending MapReduce for Efficient OLAP Cube Materialization and View Maintenance.
Abstract
Data cubes are widely used as a powerful tool to provide multi-dimensional views in data warehousing and On-Line Analytical Processing (OLAP). However, with increasing data sizes, data cube analysis is becoming computationally expensive. In this paper, we introduce HaCube, an extension of MapReduce designed for efficient parallel data cube computation on large-scale data. We also provide a general data cube materialization solution that exploits the features of MapReduce-like systems for efficient data cube computation. Furthermore, we demonstrate how HaCube supports view maintenance through either incremental computation (e.g., for SUM or COUNT) or recomputation (e.g., for MEDIAN or CORRELATION). We implement HaCube by extending Hadoop and evaluate it on the TPC-D benchmark over billions of tuples on a cluster with over 320 cores. The experimental results demonstrate the efficiency, scalability and practicality of HaCube for cube computation over large amounts of data in a distributed environment.
Year
2016
DOI
10.1007/978-3-319-32049-6_8
Venue
DASFAA
Field
Data warehouse,Data mining,Tuple,Computer science,Parallel computing,OLAP cube,Online analytical processing,Data cube,Database,Cube,Computation,Scalability
DocType
Conference
Citations
1
PageRank
0.35
References
14
Authors
5
Name, Order, Citations, PageRank
Zhengkui Wang, 1, 91, 10.46
Yan Chu, 2, 13, 7.09
Kian-Lee Tan, 3, 6962, 776.65
Divyakant Agrawal, 4, 8201, 1674.75
Amr El Abbadi, 5, 6767, 1569.95