Title
On the foundations of probabilistic information integration
Abstract
Information integration has been a subject of research for several decades and remains a very active research area. Many new applications depend on or benefit from large-scale integration. Examples include large research projects in the life sciences, the need for data sharing among government agencies, the reliance of corporations on business intelligence (which requires integrating data from many heterogeneous sources), and the integration of information on the web. The importance of information integration with uncertainty has been recognized in recent years. Frequently, information from multiple sources is uncertain and possibly inconsistent. Further, the process of integration often depends on approximate schema mappings, another source of uncertainty. An integration system is useful only to the extent that the information it produces can be trusted. Hence, providing a measure of certainty for integrated information is of crucial importance in many applications. In this paper we study the problem of integrating uncertain information. We present a simple and intuitive approach to the representation and integration of uncertain information from multiple sources, and show that our integration approach coincides with a recent formalism for uncertain information integration. We extend the model to probabilistic possible worlds, and show that certain unintuitive constraints are imposed on the probabilities of the possible worlds of sources. In particular, we show that the probabilities of the possible worlds of a source are not independent; rather, they depend on the probabilities of other sources. We study the problem of determining the probabilities for the result of integration. Finally, we present a practical approach to relaxing probabilistic constraints in integration.
Year
2012
DOI
10.1145/2396761.2396873
Venue
CIKM
Keywords
large scale integration, uncertain information integration, integration system, data integration, active research area, uncertain information, information integration, probabilistic information integration, multiple source, integrated information, integration approach
Field
Data integration, Data mining, Information integration, Computer science, Data sharing, Uncertain data, Probabilistic logic, Business intelligence, Schema (psychology), Possible world
DocType
Conference
Citations
3
PageRank
0.38
References
40
Authors
1
Name
Order
Citations
PageRank
Fereidoon Sadri1846283.70