Title
An Efficient OLAP Query Algorithm Based on Dimension Hierarchical Encoding Storage and Shark.
Abstract
On-line analytical processing (OLAP) queries typically include multi-table joins and aggregation operations in their SQL clauses. As a result, reducing multi-table joins and efficiently aggregating query results over "big data" is the key issue in query processing. This paper therefore proposes a novel OLAP query algorithm based on the dimension hierarchical encoding (DHE) storage strategy combined with in-memory computing in Shark. With DHE and Shark, a star join with hierarchy levels is mapped to a multidimensional range query on the fact table, and the large-scale data are computed through transformations and actions on resilient distributed datasets (RDDs). The experimental results show that, compared with data analysis operations in Hive, DHE and Shark reduce complex multi-table joins and I/O overhead, and that query performance is greatly improved over that of the ordinary star schema.
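The mapping from a hierarchy-level predicate to a range query on the fact table can be illustrated with a minimal sketch. The abstract does not give the paper's actual encoding, so everything here is an assumption made for illustration: the bit widths, the two example dimensions (time and region), and the helper names encode, prefix_range and range_aggregate are all hypothetical. The sketch runs in plain Python on a small in-memory list and does not use Shark or RDDs.

```python
# Hypothetical bit widths per hierarchy level, coarsest level first.
# These layouts are illustrative assumptions, not taken from the paper.
TIME_BITS = (12, 4, 5)   # year, month, day
REGION_BITS = (8, 8)     # country, city


def pack(values, level_bits):
    """Pack member ids into one integer, coarsest level in the high bits."""
    code = 0
    for value, bits in zip(values, level_bits):
        if not 0 <= value < (1 << bits):
            raise ValueError(f"member id {value} does not fit in {bits} bits")
        code = (code << bits) | value
    return code


def encode(path, level_bits):
    """Encode a full hierarchy path (one member id per level) as a single key."""
    if len(path) != len(level_bits):
        raise ValueError("path must supply one member id per level")
    return pack(path, level_bits)


def prefix_range(prefix, level_bits):
    """Inclusive [lo, hi] key range covering every member below a hierarchy prefix."""
    free_bits = sum(level_bits[len(prefix):])
    lo = pack(prefix, level_bits) << free_bits
    hi = lo | ((1 << free_bits) - 1)
    return lo, hi


def range_aggregate(fact_rows, ranges):
    """Sum the measure of rows whose encoded keys fall inside every per-dimension range."""
    total = 0
    for keys, measure in fact_rows:
        if all(lo <= key <= hi for key, (lo, hi) in zip(keys, ranges)):
            total += measure
    return total


# Toy fact table: ((time_key, region_key), measure). The encoded keys are stored
# in the fact rows, so no join against dimension tables is needed at query time.
fact_rows = [
    ((encode((2013, 5, 17), TIME_BITS), encode((1, 3), REGION_BITS)), 100),
    ((encode((2013, 6, 2), TIME_BITS), encode((1, 4), REGION_BITS)), 50),
    ((encode((2014, 1, 9), TIME_BITS), encode((1, 3), REGION_BITS)), 70),
    ((encode((2013, 5, 30), TIME_BITS), encode((2, 1), REGION_BITS)), 40),
]

# "Total measure for year 2013 in country 1" becomes two range predicates.
ranges = [prefix_range((2013,), TIME_BITS), prefix_range((1,), REGION_BITS)]
total = range_aggregate(fact_rows, ranges)
print(total)  # 150
```

Because the coarsest level occupies the high bits, every descendant of a hierarchy member shares a key prefix, so a predicate at any level reduces to one contiguous interval per dimension. In a distributed setting the same filter and sum would presumably be expressed as transformations and actions over the fact data, but that part is not shown here.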
Year
2014
DOI
10.1007/978-3-319-11897-0_21
Venue
ADVANCES IN SWARM INTELLIGENCE, ICSI 2014, PT II
Keywords
Big data, Data warehouse, Dimension hierarchical encoding, Shark
Field
SQL, Data warehouse, Data mining, Joins, Fact table, Star schema, Computer science, Range query (data structures), Algorithm, Theoretical computer science, Online analytical processing, Big data
DocType
Conference
Volume
8795
ISSN
0302-9743
Citations
1
PageRank
0.34
References
10
Authors
2
Name, Order, Citations, PageRank
Shengqiang Yao, 1, 1, 0.34
Jieyue He, 2, 128, 18.92