Title
On the Privacy of Federated Pipelines
Abstract
Federated learning (FL) is becoming an increasingly popular machine learning paradigm in application scenarios where sensitive data available at various local sites cannot be shared due to privacy protection regulations. In FL, the sensitive data never leaves the local sites and only model parameters are shared with a global aggregator. Nonetheless, it has recently been shown that, under some circumstances, the private data can be reconstructed from the model parameters, which implies that data leakage can occur in FL. In this paper, we draw attention to another risk associated with FL: Even if federated algorithms are individually privacy-preserving, combining them into pipelines is not necessarily privacy-preserving. We provide a concrete example from genome-wide association studies, where the combination of federated principal component analysis and federated linear regression allows the aggregator to retrieve sensitive patient data by solving an instance of the multidimensional subset sum problem. This supports the increasing awareness in the field that, for FL to be truly privacy-preserving, measures have to be undertaken to protect against data leakage at the aggregator.
Year
2021
DOI
10.1145/3404835.3462996
Venue
Research and Development in Information Retrieval
Keywords
Federated Learning, Privacy, Genome-Wide Association Studies, Multidimensional Subset Sum, Integer Linear Programming
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
4
Name | Order | Citations | PageRank
Reza Nasirigerdeh | 1 | 0 | 0.68
Reihaneh Torkzadehmahani | 2 | 0 | 0.34
Jan Baumbach | 3 | 148 | 22.11
David Blumenthal | 4 | 24 | 6.26