A Framework for Reconciling Attribute Values from Multiple Data Sources - Citegraph

Paper Info

Title
A Framework for Reconciling Attribute Values from Multiple Data Sources

Abstract
Because of the heterogeneous nature of different data sources, data integration is often one of the most challenging tasks in managing modern information systems. While the existing literature has focused on problems such as schema integration and entity identification, it has largely overlooked a basic question: When an attribute value for a real-world entity is recorded differently in different databases, how should the “best” value be chosen from the set of possible values? This paper provides an answer to this question. We first show how a probability distribution over a set of possible values can be derived. We then demonstrate how these probabilities can be used to solve a given decision problem by minimizing the total cost of type I, type II, and misrepresentation errors. Finally, we propose a framework for integrating multiple data sources when a single “best” value has to be chosen and stored for every attribute of an entity.
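
The abstract does not spell out the model, so the following is only a minimal sketch of the general idea of picking a stored value by minimizing expected cost over a probability distribution of candidate values. It is not the authors' formulation: the function `choose_best_value`, the example probabilities, and both cost functions are hypothetical, and only a misrepresentation-style cost is modeled (type I and type II errors, which depend on a specific decision problem, are left out).

```python
def choose_best_value(probs, cost):
    """Return the candidate value with the lowest expected cost.

    probs: dict mapping each candidate value to the (assumed) probability
           that it is the true value; probabilities should sum to 1.
    cost:  function cost(stored, true) giving the cost of storing `stored`
           when `true` is the correct value.
    """
    def expected_cost(stored):
        # Weight the cost of each possible truth by its probability.
        return sum(p * cost(stored, true) for true, p in probs.items())

    return min(probs, key=expected_cost)


# Hypothetical distribution for one attribute recorded differently in three sources.
probs = {"Dallas": 0.5, "Austin": 0.3, "Houston": 0.2}


def zero_one(stored, true):
    # Every wrong value is equally costly.
    return 0.0 if stored == true else 1.0


def asymmetric(stored, true):
    # Assumed for illustration: storing "Dallas" when the truth is "Houston"
    # is ten times as costly as any other mistake.
    if stored == true:
        return 0.0
    if stored == "Dallas" and true == "Houston":
        return 10.0
    return 1.0


print(choose_best_value(probs, zero_one))    # Dallas
print(choose_best_value(probs, asymmetric))  # Austin
```

Under the uniform cost the most probable value wins, while under the asymmetric cost a less probable value has the lower expected cost (0.7 for "Austin" versus 2.3 for "Dallas"), which is why the choice of a single stored value depends on the cost structure and not only on the probabilities.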

Year	DOI	Venue
2007	10.1287/mnsc.1070.0745	Management Science
Keywords	Field	DocType
attribute value,multiple data sources,different databases,real-world entity,different data source,schema integration,entity identification,data integration,basic question,possible value,reconciling attribute values,multiple data source,probability distribution,data quality,type ii error,data integrity,difference in differences,information system,probabilistic database,type i error,decision problem	Information system,Data integration,Data mining,Decision problem,Data quality,Computer science,Variable and attribute,Probability distribution,Type I and type II errors,Schema (psychology)	Journal
Volume	Issue	ISSN
53	12	0025-1909
Citations	PageRank	References
11	0.64	14
Authors
4

Authors (4 rows)

Name	Order	Citations	PageRank
Zhengrui Jiang	1	80	10.69
Sumit Sarkar	2	835	260.90
Prabuddha De	3	507	84.53
Debabrata Dey	4	456	206.82
