AutoAttacker: A reinforcement learning approach for black-box adversarial attacks - Citegraph

Paper Info

Title
AutoAttacker: A reinforcement learning approach for black-box adversarial attacks

Abstract
Recent research has shown that machine learning models are susceptible to adversarial examples, which allow attackers to trick a model into producing an incorrect output. Adversarial examples are commonly constructed or discovered using gradient-based methods that require white-box access to the model. In most real-world AI system deployments, assuming complete access to the machine learning model is an unrealistic threat model. However, an attacker can construct adversarial examples even in the black-box case, where we assume only a query capability to the model, using a variety of approaches, each with its advantages and shortcomings. We introduce AutoAttacker, a novel reinforcement learning framework in which agents learn how to operate around the black-box model by querying it, to effectively extract the underlying decision behaviour, and to undermine it successfully. AutoAttacker is a first-of-its-kind framework that uses reinforcement learning and assumes nothing about the differentiability or structure of the underlying function, and is thus robust to common defenses such as gradient obfuscation or adversarial training. Finally, without differentiable output, as in binary classification, most methods cease to operate and require either an approximation of the gradient or another approach altogether. Our approach, however, continues to function as the output becomes less descriptive.
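
The query-only, label-only setting described in the abstract can be illustrated with a toy sketch. This is not the paper's AutoAttacker framework: it swaps in a much simpler epsilon-greedy bandit, and every name here (`black_box`, `train_attacker`, `attack`), the linear target model, the step size, and the 0/1 reward are assumptions made purely for illustration. The agent sees only hard labels returned by queries and learns which single-feature perturbations tend to flip them.

```python
import random


def black_box(x):
    """Hypothetical target model: returns only a hard 0/1 label (no scores, no gradients)."""
    w = [2.0, -1.0, 0.5, 0.0]  # hidden from the attacker; assumed for this toy example
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0.0 else 0


def train_attacker(query, n_features, episodes=300, max_steps=20,
                   step=0.25, epsilon=0.2, lr=0.1, seed=0):
    """Learn, from label queries alone, which perturbation actions flip the label."""
    rng = random.Random(seed)
    # An action nudges one feature up or down by a fixed step.
    actions = [(i, s) for i in range(n_features) for s in (-step, step)]
    # One value table per original label, since the useful direction depends on it.
    q = {label: [0.0] * len(actions) for label in (0, 1)}
    for _ in range(episodes):
        x = [rng.uniform(-1.0, 1.0) for _ in range(n_features)]
        label = query(x)
        for _ in range(max_steps):
            if rng.random() < epsilon:
                a = rng.randrange(len(actions))  # explore
            else:
                best = max(q[label])  # exploit, breaking ties at random
                a = rng.choice([k for k, v in enumerate(q[label]) if v == best])
            i, delta = actions[a]
            x[i] += delta
            flipped = query(x) != label
            # Reward is 1 only when the hard label changes, 0 otherwise.
            q[label][a] += lr * ((1.0 if flipped else 0.0) - q[label][a])
            if flipped:
                break
    return actions, q


def attack(query, x, actions, q, max_steps=40):
    """Repeat the highest-valued action until the label flips or the budget runs out."""
    label = query(x)
    adv = list(x)  # leave the caller's input untouched
    a = max(range(len(actions)), key=q[label].__getitem__)
    i, delta = actions[a]
    for n in range(1, max_steps + 1):
        adv[i] += delta
        if query(adv) != label:
            return adv, n
    return None, max_steps


if __name__ == "__main__":
    actions, q = train_attacker(black_box, n_features=4)
    adv, n_queries = attack(black_box, [0.8, -0.3, 0.1, 0.5], actions, q)
    print(adv, n_queries)
```

Because the reward depends only on whether the returned label changed, nothing in the sketch relies on gradients or confidence scores, which is the property the abstract emphasises for the binary-classification case.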

Year	2019
DOI	10.1109/EuroSPW.2019.00032
Venue	2019 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW)
Keywords	reinforcement-learning, black-box-attack, adversarial-machine-learning
Field	Black box (phreaking), Binary classification, Mistake, Threat model, Computer science, Theoretical computer science, Adversarial machine learning, Artificial intelligence, Obfuscation, Adversarial system, Reinforcement learning
DocType	Conference
ISBN	978-1-7281-3027-9
Citations	0
PageRank	0.34
References	0
Authors	3

Authors (3 rows)

Name	Order	Citations	PageRank
Ilias Tsingenopoulos	1	0	0.34
Davy Preuveneers	2	705	65.56
Wouter Joosen	3	2898	287.70
