Title
Reliable stream data processing for elastic distributed stream processing systems
Abstract
Distributed stream processing systems (DSPSs) have proven to be an effective way to process and analyze large-scale data streams in real time. The reliability of DSPSs has become a popular topic in recent years. Novel elastic DSPSs can seamlessly adapt to stream workload changes, which introduces new reliability challenges: (1) operators can be scaled up and down at runtime, requiring fault-tolerance methods to keep data backups consistent under these runtime dynamics; (2) rollback recovery to the last checkpoint may undo recent auto-scaling adjustments, which introduces high cost and an unacceptable impact on the system. In this paper, we put forward a novel fault-tolerant mechanism to deal with these issues. In particular, we propose a self-adaptive backup unit, the elastic data slice (EDS), that can partition and merge data backups according to operator auto-scaling at runtime. The consistency of recovery is guaranteed by new upstream backup protocols, which restart the system from its state after auto-scaling instead of from the last checkpoint and thus avoid high recovery latency. Based on these, we implement a prototype system named SPATE. Evaluations on SPATE show that our mechanism supports auto-scaling changes with overhead similar to that of existing approaches, while achieving low recovery latency despite auto-scaling.
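The abstract does not describe how an elastic data slice is represented, so the following is only a minimal sketch of the general idea of a backup unit that can be partitioned and merged as operators scale. It assumes that buffered tuples are routed by an integer partition key and carry a sequence number; the `DataSlice` class, its fields, and the key-range scheme are hypothetical and are not taken from SPATE.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical tuple layout: (sequence number, partition key, payload).
Record = Tuple[int, int, str]


@dataclass
class DataSlice:
    """Hypothetical backup unit holding tuples whose key lies in [lo, hi)."""
    lo: int
    hi: int
    records: List[Record] = field(default_factory=list)

    def append(self, seq: int, key: int, payload: str) -> None:
        # Reject tuples that do not belong to this slice's key range.
        if not (self.lo <= key < self.hi):
            raise ValueError(f"key {key} outside [{self.lo}, {self.hi})")
        self.records.append((seq, key, payload))

    def split(self, at: int) -> Tuple["DataSlice", "DataSlice"]:
        # Scale-out: partition the backup so each new operator instance
        # has a slice covering exactly its own key range.
        if not (self.lo < at < self.hi):
            raise ValueError(f"split point {at} must be inside ({self.lo}, {self.hi})")
        left = DataSlice(self.lo, at, [r for r in self.records if r[1] < at])
        right = DataSlice(at, self.hi, [r for r in self.records if r[1] >= at])
        return left, right

    def merge(self, other: "DataSlice") -> "DataSlice":
        # Scale-in: combine two adjacent slices and restore arrival order
        # by sorting on the sequence number.
        if self.hi != other.lo:
            raise ValueError("slices must cover adjacent key ranges")
        combined = sorted(self.records + other.records, key=lambda r: r[0])
        return DataSlice(self.lo, other.hi, combined)
```

In this sketch, splitting at a key boundary and then merging the two halves yields the original slice, which is the property a backup unit would need for its contents to stay consistent across a scale-out followed by a scale-in.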
Year: 2020
DOI: 10.1007/s10586-019-02939-9
Venue: CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS
Keywords: Distributed stream processing system, Fault tolerance, Upstream backup
DocType: Journal
Volume: 23
Issue: 2
ISSN: 1386-7857
Citations: 1
PageRank: 0.37
References: 0
Authors: 4
Name            Order   Citations   PageRank
Xiaohui Wei     1       391         54.44
Yuan Zhuang     2       1           0.37
Hongliang Li    3       19          2.73
Zhiliang Liu    4       1           0.37