Abstract |
---|
In many organizations, different databases contain different kinds of data about the same entity. There may be several good reasons for this. However, to obtain an integral and unified view of an entity, data reconciliation is crucial. In this paper, we present an approach to data reconciliation that is based on the available schemata of the data sources and on the content of those sources. The schemata are used to determine which parts of them pertain to the same entity type. The content of the sources is used to determine the association between attributes stored in different sources. In establishing the relationships between attributes, we have also exploited the knowledge of domain experts. On the basis of the information collected about a set of attributes, we assign a similarity measure to these attributes. Once we have identified the set of similar attributes, we reconcile two entities on the basis of the similarity measure. We illustrate the effectiveness of our approach by means of a real-life case in the field of police and justice. |
Year | Venue | Keywords |
---|---|---|
2009 | DG.O | data reconciliation, criminal justice chain, different schema, organizations different databases, entity type, different source, different kind, similarity measure, towards privacy, schemata pertain, data source, different attribute, criminal justice |

Field | DocType | Citations |
---|---|---|
Information retrieval, Similarity measure, Computer science, Knowledge management, Criminal justice, Schema (psychology) | Conference | 4 |

PageRank | References | Authors |
---|---|---|
0.97 | 6 | 2 |

Name | Order | Citations | PageRank |
---|---|---|---|
Sunil Choenni | 1 | 309 | 111.82 |
Jan van Dijk | 2 | 352 | 27.66 |