Title | ||
---|---|---|
HaCube: Extending MapReduce for Efficient OLAP Cube Materialization and View Maintenance. |
Abstract | ||
---|---|---|
Data cubes are widely used as a powerful tool to provide multi-dimensional views in data warehousing and On-Line Analytical Processing (OLAP). However, with increasing data sizes, it is becoming computationally expensive to perform data cube analysis. In this paper, we introduce HaCube, an extension of MapReduce designed for efficient parallel data cube computation on large-scale data. We also provide a general data cube materialization solution that exploits the features of MapReduce-like systems for efficient data cube computation. Furthermore, we demonstrate how HaCube supports view maintenance through either incremental computation (e.g., for SUM or COUNT) or recomputation (e.g., for MEDIAN or CORRELATION). We implement HaCube by extending Hadoop and evaluate it on the TPC-D benchmark over billions of tuples on a cluster with over 320 cores. The experimental results demonstrate the efficiency, scalability, and practicality of HaCube for cube computation over large amounts of data in a distributed environment. |
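
The abstract's distinction between incremental maintenance (SUM, COUNT) and recomputation (MEDIAN) can be illustrated with a small single-machine sketch. This is not HaCube's implementation and does not use MapReduce or Hadoop. The class names `SumCountView` and `MedianView`, the helper `group_key`, and the sample columns `region` and `qty` are all hypothetical and chosen only for illustration. The sketch assumes one group-by (one cuboid) and insert-only deltas.

```python
from collections import defaultdict
from statistics import median


def group_key(row, dims):
    """Build the group-by key of a row for the given dimension names."""
    return tuple(row[d] for d in dims)


class SumCountView:
    """Hypothetical view for SUM and COUNT: a delta is merged into the
    stored aggregates without rereading the base data."""

    def __init__(self, dims, measure):
        self.dims = dims
        self.measure = measure
        self.cells = defaultdict(lambda: [0, 0])  # key -> [sum, count]

    def apply_delta(self, rows):
        for row in rows:
            cell = self.cells[group_key(row, self.dims)]
            cell[0] += row[self.measure]
            cell[1] += 1


class MedianView:
    """Hypothetical view for MEDIAN: the old median plus the delta is not
    enough, so the raw values are kept and touched cells are recomputed."""

    def __init__(self, dims, measure):
        self.dims = dims
        self.measure = measure
        self.values = defaultdict(list)  # key -> all measure values seen
        self.cells = {}                  # key -> current median

    def apply_delta(self, rows):
        touched = set()
        for row in rows:
            key = group_key(row, self.dims)
            self.values[key].append(row[self.measure])
            touched.add(key)
        # Recompute only the cells whose input changed.
        for key in touched:
            self.cells[key] = median(self.values[key])


if __name__ == "__main__":
    base = [
        {"region": "EU", "qty": 1},
        {"region": "EU", "qty": 5},
        {"region": "US", "qty": 2},
    ]
    delta = [{"region": "EU", "qty": 3}]

    sums = SumCountView(["region"], "qty")
    meds = MedianView(["region"], "qty")
    for view in (sums, meds):
        view.apply_delta(base)   # initial materialization
        view.apply_delta(delta)  # view maintenance on new tuples

    print(dict(sums.cells))
    print(meds.cells)
```

The SUM/COUNT view stores only two numbers per cell, while the MEDIAN view has to retain every value, which is why the two kinds of measures call for different maintenance strategies.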
Year | DOI | Venue |
---|---|---|
2016 | 10.1007/978-3-319-32049-6_8 | DASFAA |
Field | DocType | Citations |
Data warehouse, Data mining, Tuple, Computer science, Parallel computing, OLAP cube, Online analytical processing, Data cube, Database, Cube, Computation, Scalability | Conference | 1 |
PageRank | References | Authors |
0.35 | 14 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Zhengkui Wang | 1 | 91 | 10.46 |
Yan Chu | 2 | 13 | 7.09 |
Kian-Lee Tan | 3 | 6962 | 776.65 |
Divyakant Agrawal | 4 | 8201 | 1674.75 |
Amr El Abbadi | 5 | 6767 | 1569.95 |