Paper Info

Title
HYBRIDJOIN for Near-Real-Time Data Warehousing

Abstract
An important component of near-real-time data warehouses is the near-real-time integration layer. One important element in near-real-time data integration is the join of a continuous input data stream with a disk-based relation. For high-throughput streams, stream-based algorithms such as Mesh Join (MESHJOIN) can be used. However, the performance of MESHJOIN is inversely proportional to the size of the disk-based relation. The Index Nested Loop Join (INLJ) can be set up to process stream input and can deal with intermittences in the update stream, but it has low throughput. This paper introduces a robust stream-based join algorithm called Hybrid Join (HYBRIDJOIN), which combines the two approaches. A theoretical result shows that HYBRIDJOIN is asymptotically as fast as the faster of the two algorithms. The authors present performance measurements of the implementation. In experiments using synthetic data based on a Zipfian distribution, HYBRIDJOIN performs significantly better for typical parameters of the Zipfian distribution and in general performs in accordance with the theoretical model, whereas the other two algorithms are each unacceptably slow under some settings.
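
As an illustration of the stream-relation join the abstract refers to, the following is a minimal sketch of an index-nested-loop-style join, not the paper's implementation. It assumes the disk-based relation fits in an in-memory dictionary that stands in for a disk index; the names build_index and index_nested_loop_join and the (key, value) tuple layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def build_index(relation: Iterable[Tuple[int, str]]) -> Dict[int, List[str]]:
    """Group relation payloads by join key (stand-in for a disk-based index)."""
    index: Dict[int, List[str]] = defaultdict(list)
    for key, payload in relation:
        index[key].append(payload)
    return index


def index_nested_loop_join(
    stream: Iterable[Tuple[int, str]],
    index: Dict[int, List[str]],
) -> Iterator[Tuple[int, str, str]]:
    """Probe the index once per incoming stream tuple and emit the matches."""
    for key, value in stream:
        # .get avoids creating empty entries for keys absent from the relation.
        for payload in index.get(key, []):
            yield key, value, payload


# Hypothetical sample data: a small relation and three stream tuples.
relation = [(1, "a"), (2, "b"), (2, "c")]
stream = [(2, "x"), (3, "y"), (1, "z")]
result = list(index_nested_loop_join(stream, build_index(relation)))
```

Each stream tuple triggers its own index lookup. Tuples are handled as they arrive, so gaps in the stream cause no difficulty, but against a real disk-based index every lookup costs at least one access, which is consistent with the low throughput the abstract attributes to INLJ.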

Year	2011
DOI	10.4018/jdwm.2011100102
Venue	IJDWM
Keywords	near-real-time data warehousing, near-real-time data warehouse, zipfian distribution, index nested loop join, hybrid join hybridjoin, continuous input data stream, near-real-time data integration, synthetic data, disk-based relation, high-throughput stream, mesh join meshjoin, data transformation, data warehousing, near real time
Field	Data integration, Data warehouse, Data mining, Zipf's law, Computer science, Data stream, Synthetic data, Throughput, Nested loop join
DocType	Journal
Volume	7
Issue	4
ISSN	1548-3924
Citations	10
PageRank	0.61
References	27
Authors
3

Name	Order	Citations	PageRank
Gill Dobbie	1	728	77.75
M. Asif Naeem	2	102	19.73
Gerald Weber	3	248	30.62
