Title
Toward optimal operator parallelism for stream processing topology with limited buffers
Abstract
Stream processing is an emerging in-memory computing paradigm for handling massive amounts of real-time data. A mechanism that proposes proper parallelism for the operators is vital to processing streaming data efficiently. Previous research has mostly focused on parallelism optimization with infinite buffers; however, the topology's quality of service is severely affected by network buffers. In this paper, we therefore introduce an extended queueing network to model the relationship between parallelism and a tuple's average sojourn time under limited buffers. Based on this model, we also propose greedy algorithms that compute the optimal parallelism for both minimum latency and maximum throughput under resource constraints. To fairly evaluate the performance of different models, a random parameter generator for streaming topologies is presented. Experiments show that the extended queueing model can properly forecast performance. Compared with the state-of-the-art method, the proposed algorithms reduce the median total sojourn time by a factor of 3.74 and increase the average maximum sustainable throughput by a factor of 1.69.
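The greedy allocation idea described in the abstract can be sketched as follows. This is a minimal illustration under assumed modeling choices, not the authors' algorithm: each operator is approximated as an M/M/c/K queue (c parallel replicas, at most K tuples buffered), and spare executor slots are handed out one at a time to the operator whose mean sojourn time drops the most. The operator names, arrival and service rates, buffer sizes, and slot budget below are hypothetical.

import math

def mmck_sojourn(lam, mu, c, K):
    # Mean sojourn time of an M/M/c/K queue: c identical servers (operator
    # replicas), at most K tuples in the system (limited buffer).
    a = lam / mu
    rho = a / c
    head = [a ** n / math.factorial(n) for n in range(c)]
    tail = [a ** c / math.factorial(c) * rho ** (n - c) for n in range(c, K + 1)]
    p0 = 1.0 / (sum(head) + sum(tail))
    probs = [p0 * t for t in head + tail]      # steady-state probabilities p_0..p_K
    lam_eff = lam * (1.0 - probs[K])           # arrivals not dropped by a full buffer
    mean_in_system = sum(n * p for n, p in enumerate(probs))
    return mean_in_system / lam_eff if lam_eff > 0 else float("inf")

def greedy_parallelism(operators, slots):
    # Give every operator one replica, then assign the remaining executor
    # slots greedily to the operator whose sojourn time improves the most.
    alloc = {name: 1 for name in operators}
    for _ in range(slots - len(operators)):
        best, best_gain = None, 0.0
        for name, (lam, mu, K) in operators.items():
            gain = (mmck_sojourn(lam, mu, alloc[name], K)
                    - mmck_sojourn(lam, mu, alloc[name] + 1, K))
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:
            break
        alloc[best] += 1
    return alloc

# Hypothetical three-operator topology: (arrival rate, per-replica service rate, buffer size)
ops = {"parse": (80.0, 30.0, 20), "join": (80.0, 50.0, 20), "sink": (80.0, 100.0, 20)}
print(greedy_parallelism(ops, slots=10))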
Year
2022
DOI
10.1007/s11227-022-04376-9
Venue
The Journal of Supercomputing
Keywords
Stream processing, Performance optimization, Resource optimization, Datastream management systems
DocType
Journal
Volume
78
Issue
11
ISSN
0920-8542
Citations
0
PageRank
0.34
References
17
Authors
5
Name          Order  Citations  PageRank
Wenhao Li     1      0          0.34
Zhan Zhang    2      19         10.81
Yanjun Shu    3      0          0.34
Hongwei Liu   4      27         5.90
Tianming Liu  5      0          0.34