Title
A Performance-Improved and Storage-Efficient Secondary Index for Big Data Processing
Abstract
Smart grids today contain billions of devices, and every data source produces thousands of data records each day. Classical relational databases break down when dealing with data sets at this scale, so a powerful big data platform is needed to process the information in the private clouds of the smart grid. HBase is a promising platform for this purpose. However, finding an effective indexing scheme remains hard, because most existing schemes that retrieve data by column are time-consuming. In this paper, we present a refined secondary index scheme on HBase that both accelerates queries and saves storage space. Experimental results show that, for join operations, the proposed indexing scheme provides a speedup of 5.584x to 571.360x over a query scheme without any index, and a speedup of 1.026x to 4.761x over a classical secondary index. The proposed secondary index scheme is thus feasible and effective in terms of both query performance and storage efficiency.
Year
2017
DOI
10.1109/SmartCloud.2017.32
Venue
2017 IEEE International Conference on Smart Cloud (SmartCloud)
Keywords
Smart Grid, Secondary Index, HBase, Query-oriented, Storage-efficient, Speedup
Field
Big data processing, Data mining, Data set, Smart grid, Relational database, Computer science, Search engine indexing, Storage efficiency, Big data, Speedup
DocType
Conference
ISBN
978-1-5386-3685-5
Citations
0
PageRank
0.34
References
3
Authors
7
Name          Order  Citations  PageRank
Han Wu        1      62         10.14
Yongxin Zhu   2      466        58.07
Chang Wang    3      33         12.55
Junjie Hou    4      11         6.79
Mengjun Li    5      2          4.08
Qixuan Xue    6      0          1.35
Kedun Mao     7      0          1.01