Abstract
---
In safety-critical machine learning applications, it is crucial to defend models against adversarial attacks: small modifications of the input that change the model's predictions. Besides the rigorously studied $\ell_p$-bounded additive perturbations, semantic perturbations (e.g., rotation, translation) raise serious concerns about deploying ML systems in the real world. It is therefore important to provide provable guarantees for deep learning models against semantically meaningful input transformations. In this paper, we propose a new universal probabilistic certification approach based on Chernoff-Cramér bounds that can be used in general attack settings. We estimate the probability that a model fails when the attack is sampled from a given distribution. Our theoretical findings are supported by experimental results on different datasets.
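The certification idea in the abstract amounts to upper-bounding the probability that a classifier's margin becomes negative under transformations sampled from the attack distribution. The sketch below is a minimal Monte Carlo illustration of such a Chernoff-style bound, not the paper's exact procedure (which must also control the estimation error of the empirical bound); `chernoff_failure_bound` and the synthetic `margins` are hypothetical names introduced here purely for illustration.

```python
import numpy as np

def chernoff_failure_bound(margins: np.ndarray,
                           lambdas: np.ndarray = np.linspace(0.1, 10.0, 100)) -> float:
    """Monte Carlo Chernoff-style upper bound on P(margin <= 0).

    `margins` holds s(x') = f_y(x') - max_{c != y} f_c(x'), the classification
    margin of the true class y, evaluated on inputs x' perturbed by
    transformations sampled from the attack distribution. Markov's inequality
    applied to exp(-lambda * s) gives
        P(s <= 0) <= inf_{lambda > 0} E[exp(-lambda * s)],
    so we take the minimum of the empirical MGF over a grid of lambdas.
    """
    mgf = np.array([np.exp(-lam * margins).mean() for lam in lambdas])
    # A probability is trivially at most 1, so clip the estimate.
    return float(min(mgf.min(), 1.0))

# Usage with synthetic margins standing in for a real model and sampler:
# mostly-positive margins correspond to a classifier that is usually
# correct under the sampled perturbations.
rng = np.random.default_rng(0)
margins = rng.normal(loc=1.5, scale=0.5, size=10_000)
print(f"failure probability bound: {chernoff_failure_bound(margins):.4f}")
```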
Year | Venue | Keywords
---|---|---|
2022 | AAAI Conference on Artificial Intelligence | Machine Learning (ML)

DocType | Citations | PageRank
---|---|---|
Conference | 0 | 0.34

References | Authors
---|---|
0 | 6
Name | Order | Citations | PageRank |
---|---|---|---|
Mikhail Pautov | 1 | 0 | 0.34 |
Nurislam Tursynbek | 2 | 0 | 1.35 |
Marina Munkhoeva | 3 | 4 | 2.43
Nikita Muravev | 4 | 0 | 0.34 |
Aleksandr Petiushko | 5 | 0 | 0.68 |
Ivan V. Oseledets | 6 | 306 | 41.96 |