Title
FlexAnalytics: A Flexible Data Analytics Framework for Big Data Applications with I/O Performance Improvement
Abstract
Increasingly large-scale applications are generating an unprecedented amount of data. However, the growing gap between computation and I/O capacity on High End Computing (HEC) machines creates a severe bottleneck for data analysis. Instead of moving data from its source to the output storage, in-situ analytics processes output data while simulations are running. However, in-situ data analysis competes with the simulations for computing resources. Such contention severely degrades simulation performance on HEC platforms. Since different data processing strategies have different impacts on performance and cost, there is a consequent need for flexibility in the location of data analytics. In this paper, we explore and analyze several potential data-analytics placement strategies along the I/O path. To find the best strategy for reducing data movement in a given situation, we propose a flexible data analytics (FlexAnalytics) framework. Based on this framework, a FlexAnalytics prototype system is developed for analytics placement. The FlexAnalytics system enhances the scalability and flexibility of the current I/O stack on HEC platforms and is useful for data pre-processing, runtime data analysis and visualization, as well as for large-scale data transfer. Two use cases, scientific data compression and remote visualization, have been applied in the study to verify the performance of FlexAnalytics. Experimental results demonstrate that the FlexAnalytics framework increases data transfer bandwidth and improves the application end-to-end transfer performance.
Year: 2014
DOI: 10.1016/j.bdr.2014.07.001
Venue: Big Data Research
Keywords: I/O bottlenecks, In-situ analytics, Data preparation, Big data, High-end computing
DocType: Journal
Volume: 1
ISSN: 2214-5796
Citations: 18
PageRank: 0.86
References: 26
Authors: 4
Name                       Order  Citations  PageRank
Hongbo Zou                 1      34         2.59
Yongen Yu                  2      80         4.55
Wei Tang                   3      152        10.65
Hsuan-Wei Michelle Chen    4      112        5.74