Title
Anonymization and Analysis of Horizontally and Vertically Divided User Profile Databases with Multiple Sensitive Attributes
Abstract
Preventing the identification of individuals is important when data analysts must guarantee the safety of the data they work with. One approach to this problem is to alter or delete part of the data values. For this purpose, the attributes of individual records are divided into three groups: identifiers (ID), quasi-identifiers (QID), and sensitive attributes (SA). An ID is an attribute that identifies an individual directly, such as a name. QIDs are attributes that can identify an individual when combined, such as age and birthplace. An SA is sensitive information that should not be exposed once a record is linked to an individual. Building on these concepts, safety metrics for data, such as l-diversity, have been proposed. l-diversity assumes that no one knows the SA values in advance, and the data are processed so that attackers cannot identify individuals. However, there are scenarios in which existing methods cannot protect the data against an invasion of privacy. When multiple organizations conduct an analysis jointly, they integrate their data to make the analysis more effective. Although this can yield valuable results, the integrated data may include information that attackers can use to identify people. Specifically, if the attacker is one of the organizations providing data, it can use the SA values in its own data as QID values. This violates the assumption of l-diversity, so the existing safety metric no longer protects the data. In this paper, we propose a new anonymization method that conceals organizations' important data by inserting dummy values, thereby enabling analysts to use the data safely. We also provide a calculation method that reduces the influence of the noise introduced by the dummy insertion. We confirm the effectiveness of these methods by measuring accuracy in data analysis experiments.
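The insider scenario described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: it only computes distinct l-diversity (the smallest number of distinct SA values in any group of records sharing the same QID values) and shows how that number can drop when a data provider treats its own SA as an extra QID. The table, the column names, and the helper `distinct_l` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def distinct_l(records, qid_cols, sa_col):
    """Return the smallest number of distinct sa_col values found in any
    group of records that share the same values for qid_cols."""
    groups = defaultdict(set)
    for r in records:
        key = tuple(r[c] for c in qid_cols)
        groups[key].add(r[sa_col])
    return min(len(values) for values in groups.values())


# Hypothetical integrated table: QIDs are already generalized, and two
# organizations each contributed one sensitive attribute (disease, income).
records = [
    {"age": "20-29", "region": "East", "disease": "flu",    "income": "low"},
    {"age": "20-29", "region": "East", "disease": "asthma", "income": "low"},
    {"age": "20-29", "region": "East", "disease": "flu",    "income": "high"},
    {"age": "20-29", "region": "East", "disease": "asthma", "income": "high"},
    {"age": "30-39", "region": "West", "disease": "flu",    "income": "low"},
    {"age": "30-39", "region": "West", "disease": "cancer", "income": "high"},
]

# An outside attacker knows only the ordinary QIDs.
l_outsider = distinct_l(records, ["age", "region"], "income")

# An insider that supplied "disease" already knows it, so it acts as a QID.
l_insider = distinct_l(records, ["age", "region", "disease"], "income")

print(l_outsider, l_insider)
```

In this toy table every QID group contains two distinct income values for an outsider, but once the disease column is known, some groups contain a single income value, so the l-diversity guarantee no longer holds against the insider.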
Year: 2018
DOI: 10.1109/SOLI.2018.8476730
Venue: 2018 IEEE International Conference on Service Operations and Logistics, and Informatics (SOLI)
Keywords: privacy preserving data mining, privacy
DocType: Conference
ISBN: 978-1-5386-4523-9
Citations: 0
PageRank: 0.34
References: 4
Authors (4)
Name             Order  Citations  PageRank
Yuki Ina         1      0          0.34
Yuichi Sei       2      12         7.26
Yasuyuki Tahara  3      163        49.16
Akihiko Ohsuga   4      283        73.35