Title
Can You Fake It Until You Make It? - Impacts of Differentially Private Synthetic Data on Downstream Classification Fairness.
Abstract
The recent adoption of machine learning models in high-risk settings such as medicine has increased demand for developments in privacy and fairness. Rebalancing skewed datasets using synthetic data created by generative adversarial networks (GANs) has shown potential to mitigate disparate impact on minoritized subgroups. However, such generative models are subject to privacy attacks that can expose sensitive data from the training dataset. Differential privacy (DP) is the current leading solution for privacy-preserving machine learning. Differentially private GANs (DP GANs) are often considered a potential solution for improving model fairness while maintaining privacy of sensitive training data. We investigate the impact of using synthetic images from DP GANs on downstream classification model utility and fairness. We demonstrate that existing DP GANs cannot simultaneously maintain model utility, privacy, and fairness. The images generated from GAN models trained with DP exhibit extreme decreases in image quality and utility, which lead to poor downstream classification model performance. Our evaluation highlights the friction between privacy, fairness, and utility and how this directly translates into real loss of performance and representation in common machine learning settings. Our results show that additional work improving the utility and fairness of DP generative models is required before they can be utilized as a potential solution to privacy and fairness issues stemming from lack of diversity in the training dataset.
Year
2021
DOI
10.1145/3442188.3445879
Venue
FAccT
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
5
Name                   Order  Citations  PageRank
Victoria Cheng         1      0          0.34
Vinith M. Suriyakumar  2      0          0.34
Natalie Dullerud       3      0          0.34
Shalmali Joshi         4      1          1.02
Marzyeh Ghassemi       5      115        20.11