Title
Defending Against Neural Network Model Stealing Attacks Using Deceptive Perturbations
Abstract
Machine learning architectures are readily available, but obtaining the high-quality labeled data needed for training is costly. Pre-trained models exposed as cloud services can be used to generate this costly labeled data, which would allow an attacker to replicate trained models, effectively stealing them. Limiting the information provided by cloud-based models by omitting class probabilities has been proposed as a means of protection, but it significantly impacts the utility of the models. In this work, we illustrate how cloud-based models can still provide useful class probability information to users while significantly limiting an adversary's ability to steal the model. Our defense perturbs the model's final activation layer, slightly altering the output probabilities. This forces the adversary to discard the class probabilities, requiring significantly more queries before they can train a model with comparable performance. We evaluate our defense under diverse scenarios and defense-aware attacks. Our evaluation shows that our defense can degrade the accuracy of the stolen model by at least 20%, or increase the number of queries required by an adversary 64-fold, all with a negligible decrease in the protected model's accuracy.
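The defense sketched in the abstract perturbs only the final probability outputs while keeping the predicted class usable for honest clients. Below is a minimal Python sketch of that general idea, assuming a service that returns softmax probabilities; the function name deceptive_softmax, the uniform-noise scheme, the alpha parameter, and the hash-seeded determinism are illustrative assumptions and not the paper's actual perturbation.

import hashlib
import numpy as np

def deceptive_softmax(logits, alpha=0.2):
    """Illustrative sketch (not the paper's exact method): return perturbed
    class probabilities that keep the argmax class intact but distort the
    probability values, so they give a model thief little faithful signal.
    """
    logits = np.asarray(logits, dtype=np.float64)
    # Standard, numerically stabilized softmax.
    exp = np.exp(logits - logits.max())
    probs = exp / exp.sum()

    # Deterministic, input-dependent noise: repeated queries on the same
    # input receive the same perturbed answer, so averaging cannot remove it.
    digest = hashlib.sha256(np.round(logits, 6).tobytes()).digest()
    rng = np.random.default_rng(int.from_bytes(digest[:8], "big"))
    noise = rng.uniform(-alpha, alpha, size=probs.shape)

    perturbed = np.clip(probs + noise, 1e-6, None)
    perturbed /= perturbed.sum()

    # Keep the top-1 prediction unchanged so legitimate users still get the
    # correct label, even though the reported probabilities are no longer faithful.
    top = int(probs.argmax())
    j = int(perturbed.argmax())
    if j != top:
        perturbed[top], perturbed[j] = perturbed[j], perturbed[top]
    return perturbed

# Example: the returned vector still ranks class 0 highest, but its values
# differ from the true softmax output.
print(deceptive_softmax([2.1, 0.3, -1.0]))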
Year
2019
DOI
10.1109/SPW.2019.00020
Venue
2019 IEEE Security and Privacy Workshops (SPW)
Keywords
neural network, machine learning, model stealing, model theft, security, threat
Field
Computer security, Computer science, Computer network, Labeled data, Adversary, Artificial neural network, Replicate, Limiting, Cloud computing
DocType
Conference
ISBN
978-1-7281-3509-0
Citations
2
PageRank
0.35
References
3
Authors
4
Name                Order    Citations    PageRank
Taesung Lee         1        21           3.85
Benjamin Edwards    2        19           3.77
Ian Molloy          3        733          38.81
Dong Su             4        12           3.98