Title
A-DSP: An Adaptive Join Algorithm for Dynamic Data Stream on Cloud System
Abstract
Join operations, including both equi-joins and non-equi-joins, are essential to complex data analytics in the big data era. However, they are not natively supported by existing Distributed Stream Processing Engines (DSPEs). State-of-the-art join solutions on DSPEs rely on either complicated routing strategies or resource-inefficient processing structures, both of which are vulnerable to dynamic workloads, especially when the DSPEs face varied join predicates and skewed data distributions. In this paper, we propose a new cost-effective stream join framework, named A-DSP (Adaptive Dimensional Space Processing), which enhances the adaptability of the real-time join model and minimizes the resources used under dynamic workloads. Our proposal includes: 1) a join model generation algorithm that adaptively switches between different join schemes so as to minimize the number of processing tasks required; 2) a load-balancing mechanism that maximizes processing throughput; and 3) a lightweight algorithm designed to cut unnecessary migration cost. Extensive experiments compare our proposal against state-of-the-art solutions on both benchmark and real-world workloads. The experimental results verify the effectiveness of our method, especially in reducing operational cost under a pay-as-you-go pricing scheme.
Year: 2021
DOI: 10.1109/TKDE.2019.2947055
Venue: IEEE Transactions on Knowledge and Data Engineering
Keywords: Distributed stream join, theta-join, cost effective
DocType: Journal
Volume: 33
Issue: 5
ISSN: 1041-4347
Citations: 0
PageRank: 0.34
References: 0
Authors: 6
Name | Order | Citations | PageRank
Junhua Fang | 1 | 15 | 6.43
Rong Zhang | 2 | 356 | 20.92
Yan Zhao | 3 | 45 | 9.79
Kai Zheng | 4 | 936 | 69.43
Xiaofang Zhou | 5 | 5381 | 342.70
Aoying Zhou | 6 | 11 | 7.96