Title | ||
---|---|---|
Distributed L-diversity using spark-based algorithm for large resource description frameworks data |
Abstract | ||
---|---|---|
Privacy concerns have emerged around resource description framework (RDF) data as public government open data and individuals' healthcare data are increasingly used. Because these data may include personal information, they must undergo a de-identification process that deletes or replaces parts of the original data. To provide this protection, a method has been developed that applies k-anonymization to RDF data. However, sensitive RDF information anonymized with k-anonymization is not completely secure and remains vulnerable to attacks. In this paper, we propose an l-diversity anatomy de-identification method that overcomes the limitations of k-anonymity and guarantees stronger privacy protection than k-anonymization. Further, because the anonymization process is computationally intensive, we use Spark distributed computing to de-identify data rapidly and thereby enhance its utility. We also propose l-diversity preservation for dynamically evolving RDF datasets. Experimental results show that the proposed distributed l-diversity algorithm processes the data more efficiently than conventional approaches. |
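
The abstract does not spell out the algorithm, so the following is only a minimal single-machine sketch of the general idea behind anatomy-style l-diversity grouping, not the paper's Spark implementation. It assumes each record is a hypothetical `(record_id, sensitive_value)` pair; the function names `anatomy_groups` and `is_l_diverse` are illustrative and do not come from the paper. Records are bucketed by sensitive value, and each group draws one record from each of the `l` currently largest buckets, so every group holds at least `l` distinct sensitive values.

```python
from collections import defaultdict


def anatomy_groups(records, l):
    """Partition (record_id, sensitive_value) pairs into groups that each
    contain at least l distinct sensitive values (illustrative sketch)."""
    # Bucket record ids by their sensitive value.
    buckets = defaultdict(list)
    for rec_id, sensitive in records:
        buckets[sensitive].append(rec_id)

    groups = []
    # While at least l sensitive values still have records, draw one record
    # from each of the l largest buckets to form a new group.
    while sum(1 for ids in buckets.values() if ids) >= l:
        largest = sorted(
            (s for s in buckets if buckets[s]),
            key=lambda s: (-len(buckets[s]), s),
        )[:l]
        groups.append([(buckets[s].pop(), s) for s in largest])

    # Residual records: add each to a group that lacks its sensitive value.
    for sensitive, ids in buckets.items():
        for rec_id in ids:
            target = next(
                (g for g in groups if all(v != sensitive for _, v in g)),
                None,
            )
            if target is None:
                raise ValueError("records cannot be made l-diverse")
            target.append((rec_id, sensitive))
    return groups


def is_l_diverse(groups, l):
    """Check distinct l-diversity: every group has >= l sensitive values."""
    return all(len({s for _, s in g}) >= l for g in groups)


# Hypothetical toy data: (record_id, sensitive_value).
records = [(1, "flu"), (2, "flu"), (3, "cancer"),
           (4, "hiv"), (5, "cancer"), (6, "flu")]
groups = anatomy_groups(records, l=2)
```

In a distributed setting the bucketing step would presumably map onto a group-by-key over the sensitive attribute, but how the paper partitions this work across Spark executors is not stated in the abstract.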
Year | DOI | Venue |
---|---|---|
2021 | 10.1007/s11227-020-03583-6 | The Journal of Supercomputing |
Keywords | DocType | Volume |
---|---|---|
Privacy protection, Resource description framework (RDF), De-identification, l-diversity, Anatomy algorithm, Spark | Journal | 77 |
Issue | ISSN | Citations |
---|---|---|
7 | 0920-8542 | 0 |
PageRank | References | Authors |
---|---|---|
0.34 | 2 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
MinHyuk Jeon | 1 | 0 | 0.34 |
Odsuren Temuujin | 2 | 0 | 0.34 |
Jinhyun Ahn | 3 | 0 | 0.34 |
Dong-Hyuk Im | 4 | 35 | 6.06 |