Title
An Entropy-Based Analytic Model For The Privacy-Preserving In Open Data
Abstract
In the Big Data era, many open data sets are published and shared with the public, creating new services and businesses. However, publication may leak private information. De-identification techniques are generally applied to the data before publication, but they do not solve the problem completely. Personal data can be obtained from several sources, such as Internet services and social media. In this situation, a de-identified open data set can simply be joined with leaked external data, which may lead to re-identification. We propose a new analytic model to measure the risk of personal information leakage in open data before publication. The proposed model formulates an entropy-based re-identification risk to measure the privacy leakage risk. We also use entropy to define a data utility measure that applies while privacy is preserved. Based on both the risk and the utility measures, we propose a guideline for opening data to the public. Empirical experiments show that the guideline, including the risk and utility measurements, is applicable.
Year
2016
Venue
2016 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA)
Keywords
privacy, open data, re-identification, information entropy, privacy preserving data mining
Field
Data science, Data mining, Open data, Social media, Computer science, Personally identifiable information, Publishing, Information privacy, Analytic model, Private information retrieval, Big data
DocType
Conference
Citations
0
PageRank
0.34
References
10
Authors
3
Name             Order   Citations   PageRank
Soo-Hyung Kim    1       191         49.03
Changwook Jung   2       0           0.34
Y.-J. Lee        3       233         76.22