Title
Experimental Study on the Performance and Resource Utilization of Data Streaming Frameworks
Abstract
With the advent of the Internet of Things (IoT), data stream processing has gained increased attention due to the ever-increasing need to process heterogeneous and voluminous data streams. This work addresses the problem of selecting a suitable stream processing framework for a given application to be executed within a specific physical infrastructure. For this purpose, we present a thorough comparative analysis of three data stream processing platforms, Apache Flink, Apache Storm, and Twitter Heron (the enhanced version of Apache Storm), which were chosen for their potential to process both streams and batches in real time. The goal of the work is to give cloud clients and cloud providers the knowledge needed to choose a resource-efficient and requirement-adaptive streaming platform for a given application, so that they can plan accordingly when allocating or assigning Virtual Machines for application execution. For the comparative performance analysis of the chosen platforms, we experimented on 8-node clusters on the Grid5000 experimentation testbed and selected a wide variety of applications, ranging from a conventional benchmark to a sensor-based IoT application and a statistical batch processing application. In addition to various performance metrics related to the elasticity and resource usage of the platforms, this work presents a comparative study of the “green-ness” of the streaming platforms by analyzing their power consumption, one of the first attempts of its kind. The obtained results are thoroughly analyzed to illustrate the functional behavior of these platforms under different computing scenarios.
Year
2018
DOI
10.1109/CCGRID.2018.00029
Venue
2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)
Keywords
Stream processing, Apache Flink, Apache Spark, Twitter Heron, Internet of Things
Field
Data stream mining, Virtual machine, Task analysis, Computer science, Testbed, Batch processing, Stream processing, Benchmark (computing), Distributed computing, Cloud computing
DocType
Conference
ISBN
978-1-5386-5816-1
Citations
1
PageRank
0.35
References
8
Authors
2
Name
Order
Citations
PageRank
Subarna Chatterjee11698.21
Christine Morin222626.78