Title
Optimal Operator Replication and Placement for Distributed Stream Processing Systems.
Abstract
Exploiting on-the-fly computation, Data Stream Processing (DSP) applications are widely used to process unbounded streams of data and extract valuable information in near real time. As such, they enable the development of new intelligent and pervasive services that can improve our everyday life. To keep up with the high volume of data produced daily, the operators that compose a DSP application can be replicated and placed on multiple, possibly distributed, computing nodes, so as to process the incoming data flow in parallel. Moreover, to better exploit the abundance of diffused computational resources (e.g., Fog computing), recent trends investigate the possibility of decentralizing the DSP application placement. In this paper, we present and evaluate a general formulation of the optimal DSP replication and placement (ODRP) problem as an integer linear programming problem, which takes into account the heterogeneity of application requirements and infrastructural resources. We integrate ODRP as a prototype scheduler in the Apache Storm DSP framework. Using the DEBS 2015 Grand Challenge as a benchmark application, we show the benefits of a joint optimization of operator replication and placement and how ODRP can optimize different QoS metrics, namely response time, inter-node traffic, cost, availability, and a combination thereof.
Year
2017
DOI
10.1145/3092819.3092823
Venue
SIGMETRICS Performance Evaluation Review
Field
Digital signal processing, Computer science, Quality of service, Real-time computing, Exploit, Integer programming, Throughput, Stream processing, Wireless sensor network, Data flow diagram, Distributed computing
DocType
Journal
Volume
44
Issue
4
Citations
9
PageRank
0.47
References
25
Authors
4
Name, Order, Citations, PageRank
Valeria Cardellini, 1, 1514, 106.12
Vincenzo Grassi, 2, 1746, 81.24
Francesco Lo Presti, 3, 1073, 78.83
Matteo Nardelli, 4, 77, 7.95