Title
Star join revisited: Performance internals for cluster architectures
Abstract
Data warehouse workloads are crucial for the support of on-line analytical processing (OLAP). The strategy to cope with OLAP queries on such huge amounts of data calls for the use of large parallel computers. The trend today is to use cluster architectures that show a reasonable balance between cost and performance. In such cases, it is necessary to tune the applications in order to minimize the amount of I/O and communication, such that the global execution time is reduced as much as possible. In this paper, we model and analyze the most up-to-date strategies for ad hoc star join query processing in a cluster of computers. We show that, for ad hoc query processing and assuming a limited amount of resources available, these strategies still have room for improvement both in terms of I/O and inter-node data traffic communication. Our analysis concludes with the proposal of a hybrid solution that improves these two aspects compared to the previous techniques, and shows near optimal results in a broad spectrum of cases.
Year
2007
DOI
10.1016/j.datak.2007.06.008
Venue
Data Knowl. Eng.
Keywords
olap query, broad spectrum, huge amount, inter-node data traffic communication, on-line analytical processing, cluster architecture, data warehouse workloads, query processing, performance internal, data call, limited amount, parallel computer, data warehouses, spectrum, data warehouse, star join
Field
Data warehouse, Data mining, Data traffic, Star schema, Computer science, Execution time, Online analytical processing, Database, Distributed computing
DocType
Journal
Volume
63
Issue
3
ISSN
0169-023X
Citations
3
PageRank
0.39
References
20
Authors (4)
Name                     Order  Citations  PageRank
Josep Aguilar-Saborit    1      86         8.01
Victor Muntés-Mulero     2      204        22.79
Calisto Zuzarte          3      260        31.97
Josep-L. Larriba-Pey     4      162        17.44