Title
Adversarial Risk and the Dangers of Evaluating Against Weak Attacks
Abstract
This paper investigates recently proposed approaches for defending against adversarial examples and for evaluating adversarial robustness. The existence of adversarial examples in trained neural networks reflects the fact that expected risk alone does not capture a model's performance against worst-case inputs. We motivate the use of adversarial risk as an objective, although it cannot easily be computed exactly. We then frame commonly used attacks and evaluation metrics as defining a tractable surrogate objective for the true adversarial risk. This suggests that models may become obscured to adversaries by optimizing this surrogate rather than the true adversarial risk. We demonstrate that this is a significant problem in practice by repurposing gradient-free optimization techniques into adversarial attacks, which we use to drive the accuracy of several recently proposed defenses to near zero. Our hope is that our formulations and results will help researchers to develop more powerful defenses.
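In the notation usually used for this line of work (assumed here, since this record does not define it: D is the data distribution, N(x) the set of allowed perturbations of x, L the loss, and A a concrete attack), the adversarial risk and its attack-defined surrogate can be written as

    R_{\mathrm{adv}}(\theta) = \mathbb{E}_{(x,y)\sim\mathcal{D}}\Big[ \max_{x' \in N(x)} L(\theta, x', y) \Big]

    \hat{R}_{A}(\theta) = \mathbb{E}_{(x,y)\sim\mathcal{D}}\big[ L(\theta, A(x), y) \big] \le R_{\mathrm{adv}}(\theta)

Since A(x) reaches only one point of N(x), the surrogate can only underestimate the true risk, which is why evaluating against a weak attack can make a defense look far more robust than it is.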
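The gradient-free attacks mentioned in the abstract can be built from simultaneous-perturbation (SPSA-style) finite-difference estimates, which need only black-box access to a scalar loss. A minimal sketch in Python, assuming loss_fn returns a scalar that is larger for more adversarial inputs and that inputs live in [0, 1]; the function name and hyperparameters are illustrative, not the authors' implementation:

    import numpy as np

    def spsa_attack(loss_fn, x, epsilon=0.05, steps=100,
                    delta=0.01, lr=0.01, batch=8):
        # Gradient-free L-infinity attack: ascend an SPSA estimate of
        # the gradient of loss_fn, projecting back into the epsilon-ball.
        x_adv = x.copy()
        for _ in range(steps):
            grad_est = np.zeros_like(x)
            for _ in range(batch):
                # Random Rademacher direction (+1/-1 per coordinate).
                v = np.random.choice([-1.0, 1.0], size=x.shape)
                # Two-sided finite difference along v; with Rademacher
                # directions, g * v approximates the gradient in
                # expectation (bias of order delta^2).
                g = (loss_fn(x_adv + delta * v)
                     - loss_fn(x_adv - delta * v)) / (2.0 * delta)
                grad_est += g * v
            grad_est /= batch
            # Signed ascent step, then project into the allowed set.
            x_adv = x_adv + lr * np.sign(grad_est)
            x_adv = np.clip(x_adv, x - epsilon, x + epsilon)
            x_adv = np.clip(x_adv, 0.0, 1.0)  # assumes inputs in [0, 1]
        return x_adv

Because the estimator touches the model only through loss_fn, it applies even when gradients are masked or the model contains non-differentiable components, which is exactly the failure mode of defenses evaluated only against weak gradient-based attacks.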
Year
2018
Venue
ICML
DocType
Conference
Volume
abs/1802.05666
Citations
18
PageRank
0.74
References
30
Authors
4
Name                 Order  Citations  PageRank
Jonathan Uesato      1      85         6.60
Brendan O'Donoghue   2      172        10.19
Aäron Van Den Oord   3      1585       64.43
Pushmeet Kohli       4      7398       332.84