Title
Stealing Your Data from Compressed Machine Learning Models
Abstract
Machine learning models have been widely deployed in many real-world tasks. When a non-expert data holder wants to use a third-party machine learning service for model training, it is critical to preserve the confidentiality of the training data. In this paper, we explore for the first time the potential privacy leakage in a scenario where a malicious ML provider supplies the data holder with customized training code that includes model compression, which is essential in practical deployment. The provider cannot access the training process hosted by the secured third party, but can query the models once they are publicly released. As a result, the adversary can extract high-quality sensitive training data even from deeply compressed models tailored for resource-limited devices. Our investigation shows that existing compression techniques, such as quantization, can serve as a defense against such an attack by degrading the model accuracy and the quality of memorized data simultaneously. To overcome this defense, we make an initial attempt to design a simple but stealthy quantized correlation encoding attack flow from the adversary's perspective. Three integrated components, namely data pre-processing, layer-wise data-weight correlation regularization, and data-aware quantization, are developed accordingly. Extensive experimental results show that our framework preserves both the evasiveness and the effectiveness of stealing data from compressed models.
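To make the attack flow concrete, the following is a minimal PyTorch sketch of a layer-wise data-weight correlation regularizer of the kind the abstract describes. The encoding_penalty helper, the chunking scheme, and the strength lam are illustrative assumptions, not the paper's exact formulation.

import torch
import torch.nn as nn

def pearson_corr(a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Pearson correlation between two flattened tensors of equal length."""
    a = a.flatten() - a.flatten().mean()
    b = b.flatten() - b.flatten().mean()
    return (a * b).sum() / (a.norm() * b.norm() + eps)

def encoding_penalty(model: nn.Module, secret: torch.Tensor) -> torch.Tensor:
    """Layer-wise correlation penalty: each layer's weights are pushed to
    correlate with the next chunk of the (normalized) secret data, so the
    data can later be read back from the released weights."""
    penalty = torch.zeros(())
    flat = secret.flatten()
    offset = 0
    for p in model.parameters():
        n = min(p.numel(), flat.numel() - offset)
        if n <= 1:
            break
        chunk = flat[offset : offset + n]
        # Maximizing |corr| encodes the chunk into this layer's weights.
        penalty = penalty - pearson_corr(p.flatten()[:n], chunk).abs()
        offset += n
    return penalty

# Toy usage: a small classifier whose training loss is augmented with the
# malicious regularizer hidden inside the "customized training code".
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(32, 8), torch.randint(0, 2, (32,))
secret = torch.rand(64)  # e.g. normalized pixels of a sensitive training image
lam = 0.1  # hypothetical regularization strength

for _ in range(10):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(model(x), y) + lam * encoding_penalty(model, secret)
    loss.backward()
    opt.step()

An adversary who later queries or downloads the released weights could rescale each chunk to recover an approximation of secret; the data-aware quantization component in the paper is meant to keep such an encoding readable even after the weights are quantized.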
Year
2020
DOI
10.1109/DAC18072.2020.9218633
Venue
2020 57th ACM/IEEE Design Automation Conference (DAC)
DocType
Conference
ISSN
0738-100X
ISBN
978-1-7281-1085-1
Citations
0
PageRank
0.34
References
0
Authors
6
Name           Order  Citations  PageRank
Nuo Xu         1      14         7.66
Qi Liu         2      17         3.67
Tao Liu        3      45         7.40
Zihao Liu      4      34         5.45
Xiaochen Guo   5      0          2.03
Wujie Wen      6      300        30.61