Data Quality support to on-the-fly data integration using Adaptive Query Processing - Citegraph

Paper Info

Title
Data Quality support to on-the-fly data integration using Adaptive Query Processing

Abstract
In dynamic, on-the-fly relational data integration settings, such as data mashups, there is a need to reconcile value heterogeneity across sources in order to ensure consistency and completeness of the integrated data. In this scenario, the use of exact joins to match records across sources may lead to incomplete integration, while similarity joins, often advocated as a solution to this problem, are computationally expensive. In this paper we explore the use of adaptive query processing (AQP) techniques in order to combine exact (fast) and approximate (accurate) joins when performing dynamic integration. The adaptive algorithm uses an a priori expectation of the join result size, combined with the monitoring of join progress, to statistically determine, at various points during query execution, which join operator should be used. Depending on its configuration, the algorithm can achieve various trade-offs between completeness of the join result and query execution time. Our experimental results show that appreciable savings in join execution time can be achieved in practice, at the expense of a modest reduction in result completeness.
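
The switching idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names (`adaptive_join`, `similar`), the fixed checkpoint `window`, the `tolerance` rule (a crude stand-in for the statistical test the abstract mentions), and the use of `difflib` for approximate matching are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similar(a, b, threshold=0.8):
    """Approximate match: similarity ratio at or above a threshold (assumed rule)."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def adaptive_join(left, right, expected_rate, window=4, tolerance=0.5,
                  threshold=0.8):
    """Join two lists of string keys, re-choosing the operator at checkpoints.

    expected_rate: a priori fraction of left keys expected to find a match.
    window: number of left keys processed between checkpoints.
    tolerance: use the approximate join for the next window when the observed
        match count is below tolerance * expected count (hypothetical rule).
    """
    right_index = set(right)          # hash lookup for the fast exact join
    results, trace = [], []
    mode = "exact"
    matched = 0
    for i, key in enumerate(left, start=1):
        if mode == "exact":
            hits = [key] if key in right_index else []
        else:
            # Slow path: compare the key against every right-hand key.
            hits = [r for r in right if similar(key, r, threshold)]
        if hits:
            matched += 1
            results.extend((key, r) for r in hits)
        if i % window == 0:           # checkpoint: compare progress to expectation
            expected = expected_rate * i
            mode = "approx" if matched < tolerance * expected else "exact"
            trace.append((i, matched, mode))
    return results, trace


if __name__ == "__main__":
    right = ["alice", "bob", "carol", "dave", "erin", "frank"]
    left = ["alice", "b0b", "car0l", "dav3", "erin", "frankk", "zed", "bob"]
    pairs, trace = adaptive_join(left, right, expected_rate=1.0, tolerance=0.75)
    print(pairs)
    print(trace)
```

In this toy run the first window is processed with the exact join, so misspelled keys seen before the first checkpoint are lost, while later ones are recovered once the sketch switches to the approximate join. That mirrors, in a very simplified way, the trade-off between completeness and execution time that the abstract describes.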

Year	Venue	Keywords
2009	SEBD	data integrity,data quality,relational data
Field	DocType	Citations
Data integration,Query optimization,Web search query,Data mining,Joins,Data quality,Relational database,Computer science,Sargable,Adaptive algorithm	Conference	0
PageRank	References	Authors
0.34	8	5

Authors (5 rows)

Name	Order	Citations	PageRank
Paolo Missier	1	1287	100.48
Alvaro A. A. Fernandes	2	904	77.71
Roald Lengu	3	5	0.82
Giovanna Guerrini	4	705	97.44
Marco Mesiti	5	830	72.53
