Title
On Anonymizing Medical Microdata with Large-Scale Missing Values: A Case Study with the FAERS Dataset
Abstract
As big data analysis becomes one of the main driving forces for productivity and economic growth, concern about the disclosure of individual privacy grows as well, especially for applications that access medical or health data containing personal information. Most contemporary techniques for privacy-preserving data publishing rest on a simple assumption: the data of concern is complete, i.e., contains no missing values. This is rarely the case in the real world. This paper presents our work on examining the effect of missing values on medical data privacy. In particular, we examined the US FAERS dataset, a public dataset of adverse drug events released by the US FDA. Following the presumption of the current anonymization paradigm that the data should contain no missing values, we investigated three intuitive strategies for anonymizing the FAERS dataset: including missing values, excluding them, or performing imputation. Our results demonstrate that these intuitive strategies handle data with a massive amount of missing values poorly. Accordingly, we propose a new strategy, consolidation, together with a corresponding privacy protection model and anonymization algorithm. Experimental results show that our method can prevent privacy disclosure and sustain data utility for adverse drug reaction (ADR) signal detection.
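The three intuitive strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not cover their consolidation strategy. The toy records, the column names (`age`, `sex`, `weight`), the helper names, and the choice of mode imputation are all assumptions made for illustration. The sketch measures the smallest equivalence class over the quasi-identifiers, which is the k that a table satisfies under k-anonymity.

```python
from collections import Counter

# Toy records with hypothetical quasi-identifier columns; None marks a
# missing value. These are illustrative only and are not FAERS records.
RECORDS = [
    {"age": 34, "sex": "F", "weight": None},
    {"age": 34, "sex": "F", "weight": None},
    {"age": 34, "sex": "F", "weight": 60},
    {"age": 51, "sex": "M", "weight": 80},
    {"age": 51, "sex": None, "weight": 80},
    {"age": 51, "sex": "M", "weight": 80},
]
QI = ("age", "sex", "weight")


def smallest_group(records, qi=QI):
    """Return the size of the smallest equivalence class over the quasi-identifiers."""
    if not records:
        return 0
    groups = Counter(tuple(r[c] for c in qi) for r in records)
    return min(groups.values())


def include_missing(records):
    """Strategy 1: keep every record; None is treated as an ordinary value."""
    return [dict(r) for r in records]


def exclude_missing(records, qi=QI):
    """Strategy 2: drop every record with a missing quasi-identifier."""
    return [dict(r) for r in records if all(r[c] is not None for c in qi)]


def impute_mode(records, qi=QI):
    """Strategy 3: fill each missing value with the column's most frequent
    observed value (mode imputation is an assumed, simple choice)."""
    filled = [dict(r) for r in records]
    for c in qi:
        observed = [r[c] for r in records if r[c] is not None]
        mode = Counter(observed).most_common(1)[0][0]
        for r in filled:
            if r[c] is None:
                r[c] = mode
    return filled


if __name__ == "__main__":
    for strategy in (include_missing, exclude_missing, impute_mode):
        table = strategy(RECORDS)
        print(strategy.__name__, len(table), smallest_group(table))
```

Comparing the record count and the smallest group size across the three outputs shows the kind of trade-off the abstract alludes to: exclusion discards records, while inclusion and imputation keep them but change which records fall into the same equivalence class.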
Year: 2019
DOI: 10.1109/EMBC.2019.8857025
Venue: 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)
Field: Computer vision, Data modeling, Information retrieval, Computer science, Artificial intelligence, Data publishing, Personally identifiable information, Microdata (HTML), Imputation (statistics), Missing data, Information privacy, Big data
DocType: Conference
Volume: 2019
ISSN: 1557-170X
Citations: 0
PageRank: 0.34
References: 0
Authors: 4
Name | Order | Citations | PageRank
Mei-Hui Hsiao | 1 | 0 | 0.34
Wen-Yang Lin | 2 | 399 | 35.72
Kuang-Yung Hsu | 3 | 0 | 0.34
Zih-Xun Shen | 4 | 0 | 0.34