Title
Preserving privacy whilst integrating data: Applied to criminal justice
Abstract
For many standard as well as emerging criminal law Web 2.0 applications, such as the development of mashups and dataspace systems, privacy-preserving data integration is of crucial importance. In many organizations, different databases contain different kinds of data concerning the same entity, often for good reasons. However, obtaining an integrated and unified view of an entity requires data reconciliation. In this paper, we present an approach for data reconciliation that is based on the available schemata of the data sources and on the content of the sources. The schemata are used to determine which parts pertain to the same entity type. The content of the sources is used to determine the association between different attributes stored in different sources. In establishing the relationships between attributes, we also exploit the knowledge of domain experts. On the basis of the collected information, we identify a set of attributes common to the data sources. A similarity function is associated with each attribute; it takes a record from each data source as input and computes a similarity value expressing how "similar" the records are. Depending on the similarity value, we decide whether or not to reconcile two entities. We illustrate the effectiveness of our approach by means of a real-life case in the field of police and justice. Our approach can be applied to support the development of a wide variety of criminal law applications, such as data warehouses, mashups, and dataspace systems.
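The reconciliation step described in the abstract can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the attribute names, the choice of similarity functions, the weights, the weighted-sum aggregation, and the threshold are all assumptions, since the abstract does not specify them.

```python
from difflib import SequenceMatcher


def string_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two strings, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def exact_similarity(a, b) -> float:
    """1.0 if the two values are equal, otherwise 0.0."""
    return 1.0 if a == b else 0.0


# Hypothetical common attribute set: each attribute is paired with a
# similarity function and a weight (weights are assumed and sum to 1).
ATTRIBUTES = {
    "name": (string_similarity, 0.5),
    "birth_date": (exact_similarity, 0.3),
    "city": (string_similarity, 0.2),
}


def record_similarity(r1: dict, r2: dict) -> float:
    """Weighted sum of the per-attribute similarities of two records."""
    total = 0.0
    for attr, (func, weight) in ATTRIBUTES.items():
        total += weight * func(r1[attr], r2[attr])
    return total


def should_reconcile(r1: dict, r2: dict, threshold: float = 0.85) -> bool:
    """Reconcile two records when their similarity reaches the assumed threshold."""
    return record_similarity(r1, r2) >= threshold


# Made-up records standing in for two different data sources.
police_record = {"name": "Jan de Vries", "birth_date": "1980-04-12", "city": "Rotterdam"}
court_record = {"name": "jan de vries ", "birth_date": "1980-04-12", "city": "Rotterdam"}
other_record = {"name": "Piet Bakker", "birth_date": "1975-01-01", "city": "Utrecht"}

print(should_reconcile(police_record, court_record))
print(should_reconcile(police_record, other_record))
```

In practice the threshold and weights would have to be tuned against the data, and the abstract indicates that domain experts help establish which attributes correspond across sources.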
Year
2010
DOI
10.3233/IP-2010-0202
Venue
Information Polity
Keywords
data reconciliation, preserving privacy whilst, different schema, data warehouse, data integration, criminal justice, crucial importance, different kind, similarity value, dataspace system, data source, different attribute, criminal law, mashups
Field
Data warehouse, Data integration, Data science, Data Applied, Mashup, World Wide Web, Similarity measure, Computer science, Criminal law, Criminal justice, Schema (psychology)
DocType
Journal
Volume
15
Issue
1
Citations
17
PageRank
1.10
References
9
Authors
3
Name            Order  Citations  PageRank
Sunil Choenni   1      309        11.82
Jan van Dijk    2      352        27.66
Frans Leeuw     3      18         1.82