Title
To default or not to default: exposing limitations to HBase cluster deployers.
Abstract
With the advent of sensor networks and portable devices, data is being produced rapidly and in great volume. As a result, storing and processing Big Data, in combination with the advances in cloud and virtual infrastructures, pose interesting challenges. In our previous work, we studied these challenges through experiments on different HBase cluster configurations and their impact on cluster performance. A by-product of our experiments was the observation that, in spite of advances in tooling support for setting up and configuring a Big Data cluster, the various tools are not always aligned to produce optimal or near-optimal performance for data clusters. More specifically, we show that the default configuration values of state-of-the-art cluster deployers, including Cloudera, IBM BigInsights, Apache Hortonworks and the manual HBase deployment, do not take into account the underlying infrastructure, resulting in subpar performance.
Year: 2015
Venue: CASCON
Field: Cluster (physics), IBM, Software deployment, Computer science, Wireless sensor network, Big data, Operating system, Cloud computing
DocType: Conference
Citations: 0
PageRank: 0.34
References: 3
Authors: 5
Name              Order   Citations   PageRank
Roni Sandel       1       0           0.34
Marios Fokaefs    2       231         18.28
Mark Shtern       3       180         18.51
Hamzeh Khazei     4       0           0.34
Marin Litoiu      5       2147        128.80