Title
MEAD: Model-Based Vertical Auto-Scaling for Data Stream Processing
Abstract
The unpredictable variability of Data Stream Processing (DSP) application workloads calls for advanced mechanisms and policies for elastically scaling the processing capacity of DSP operators. Although many different approaches have been used to devise such policies, most solutions have focused on data arrival rate and operator resource utilization as the key metrics for auto-scaling. We show here that, when data flows are bursty, overly simple characterizations of the input stream can still lead to very inaccurate performance estimates, which in turn affect such policies and result in sub-optimal resource allocation. We then present MEAD, a vertical auto-scaling solution that relies on an online state-based representation of burstiness to drive resource allocation. In particular, we use Markovian Arrival Processes (MAPs), which are composable with analytical queueing models, allowing us to efficiently predict performance at run-time under burstiness. We integrate MEAD in Apache Flink and evaluate its benefits over simpler yet popular auto-scaling solutions, using both synthetic and real-world workloads. Unlike existing approaches, MEAD satisfies response time requirements under burstiness, while saving up to 50% CPU resources with respect to a static allocation.
Year
2021
DOI
10.1109/CCGrid51090.2021.00041
Venue
2021 IEEE/ACM 21st International Symposium on Cluster, Cloud and Internet Computing (CCGrid)
Keywords
data stream processing, auto-scaling, workload characterization, Markovian Arrival Processes
DocType
Conference
ISBN
978-1-7281-9587-2
Citations
0
PageRank
0.34
References
19
Authors
4
Name | Order | Citations | PageRank
Gabriele Russo Russo | 1 | 22 | 2.49
Valeria Cardellini | 2 | 1514 | 106.12
Giuliano Casale | 3 | 5 | 3.88
Francesco Lo Presti | 4 | 1073 | 78.83