Title
Identifying Personal DNA Methylation Profiles by Genotype Inference
Abstract
Since the first whole-genome sequencing, the biomedical research community has made significant steps towards a more precise, predictive and personalized medicine. Genomic data is nowadays widely considered privacy-sensitive and consequently protected by strict regulations and released only after careful consideration. Various additional types of biomedical data, however, are not shielded by any dedicated legal means and are consequently disseminated much less thoughtfully. This holds true in particular for DNA methylation data, one of the most important and well-understood epigenetic elements influencing human health. In this paper, we show that, in contrast to the aforementioned belief, releasing one's DNA methylation data causes privacy issues akin to releasing one's actual genome. We show that even a small subset of methylation regions influenced by genomic variants is sufficient to infer parts of someone's genome, and to further map this DNA methylation profile to the corresponding genome. Notably, we show that such re-identification is possible with 97.5% accuracy, relying on a dataset of more than 2500 genomes, and that we can reject all wrongly matched genomes using an appropriate statistical test. We provide means for countering this threat by proposing a novel cryptographic scheme for privately classifying tumors that enables a privacy-respecting medical diagnosis in a common clinical setting. The scheme relies on a combination of random forests and homomorphic encryption, and it is proven secure in the honest-but-curious model. We evaluate this scheme on real DNA methylation data, and show that we can keep the computational overhead to acceptable values for our application scenario.
Year
2017
DOI
10.1109/SP.2017.21
Venue
2017 IEEE Symposium on Security and Privacy (SP)
Keywords
personal DNA methylation profiles, genotype inference, whole-genome sequencing, biomedical research community, predictive medicine, personalized medicine, genomic data, privacy-sensitive data, DNA methylation data, epigenetic element, human health, cryptographic scheme
Field
Genome, Overhead (computing), Data mining, Computer science, Computer security, Inference, DNA methylation, Genomics, Computational biology, Information privacy, Epigenetics, Personalized medicine
DocType
Conference
ISSN
1081-6011
ISBN
978-1-5090-5534-0
Citations
5
PageRank
0.41
References
15
Authors
7
Name             Order  Citations  PageRank
Michael Backes   1      2801       163.28
Pascal Berrang   2      33         5.16
Matthias Bieg    3      5          0.41
Roland Eils      4      644        70.09
Carl Herrmann    5      129        10.38
Mathias Humbert  6      194        16.18
Irina Lehmann    7      8          1.78