A Cache-Based Semi-Stream Join To Deal With Unmatched Stream Data - Citegraph

Paper Info

Title
A Cache-Based Semi-Stream Join To Deal With Unmatched Stream Data

Abstract
In Data Stream Management Systems (DSMS), semi-stream processing has become a popular area of research due to the high demand from applications (e.g., real-time data warehousing) for up-to-date information. One common operation in semi-stream processing is joining an incoming stream with disk-based master data. A recent algorithm called CACHEJOIN was proposed to implement this join operation. However, CACHEJOIN loads the entire stream into the join module and consumes its resources without first eliminating the stream tuples that have no matching tuples in the disk-based master data. As a result, the performance of CACHEJOIN remains suboptimal. In this paper we present a revised version of CACHEJOIN, called Improved CACHEJOIN, which removes this limitation. Eliminating unmatched tuples reduces the processing cost, and as a consequence the new algorithm significantly outperforms the existing CACHEJOIN. To quantify the performance difference, we compare both algorithms using synthetic and real datasets with a known skewed distribution. We also present the cost model for our new algorithm.
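
The abstract does not say how unmatched stream tuples are detected, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes the master-data keys fit in an in-memory set and that a small set of frequently used master rows is cached. The names `improved_join`, `hot_keys`, and `pending` are hypothetical and do not come from the paper.

```python
def improved_join(stream, master, hot_keys):
    """Join (key, payload) stream tuples with master data, dropping unmatched keys early.

    stream   -- iterable of (key, payload) tuples
    master   -- dict standing in for disk-based master data (assumption)
    hot_keys -- keys whose master rows are kept in the in-memory cache (assumption)
    """
    # Assumption: the set of master keys is small enough to hold in memory.
    master_keys = set(master)
    # Cache of frequently accessed master rows.
    cache = {k: master[k] for k in hot_keys if k in master}

    joined = []   # output tuples (key, payload, master_row)
    pending = []  # matched tuples that missed the cache and need a master lookup
    dropped = 0   # stream tuples with no match in master data

    for key, payload in stream:
        if key not in master_keys:
            # Filter out unmatched tuples before they use join resources.
            dropped += 1
            continue
        if key in cache:
            joined.append((key, payload, cache[key]))
        else:
            pending.append((key, payload))

    # Resolve cache misses against the master data (stands in for disk access).
    for key, payload in pending:
        joined.append((key, payload, master[key]))

    return joined, dropped


master = {1: "a", 2: "b", 3: "c"}
stream = [(1, "x"), (9, "y"), (3, "z"), (1, "w")]
joined, dropped = improved_join(stream, master, hot_keys=[1])
```

In this toy run the tuple with key 9 has no match in `master` and is discarded before it is queued, while the tuples with key 1 are served from the cache and the tuple with key 3 is resolved in the later master-data pass.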

Year	2015
DOI	10.1007/978-3-319-19548-3_5
Venue	DATABASES THEORY AND APPLICATIONS
Keywords	Unmatched stream data, Semi-Stream joins, Performance optimization, Data transformation
Field	Data warehouse, Data stream management system, Data mining, Computer science, Tuple, Cache, Master data, Stream data, Database
DocType	Conference
Volume	9093
ISSN	0302-9743
Citations	0
PageRank	0.34
References	9
Authors	3

Authors (3 rows)

Name	Order	Citations	PageRank
M. Asif Naeem	1	102	19.73
Imran Sarwar Bajwa	2	87	22.31
Noreen Jamil	3	12	6.01

Cited by (0 rows)

References (9 rows)
