**Abstract**

Several recent results provide theoretical insights into the phenomenon of adversarial examples. Existing results, however, are often limited by a gap between the simplicity of the models studied and the complexity of those deployed in practice. In this work, we strike a better balance by considering a model that learns a representation while at the same time admitting a precise generalization bound and a robustness certificate. We focus on the hypothesis class obtained by coupling a sparsity-promoting encoder with a linear classifier, and show an interesting interplay between the expressivity and stability of the (supervised) representation map and a notion of margin in the feature space. We bound the robust risk (to $\ell_2$-bounded perturbations) of hypotheses parameterized by dictionaries that achieve a mild encoder gap on training data. Furthermore, we provide a robustness certificate for end-to-end classification. We demonstrate the applicability of our analysis by computing certified accuracy on real data, and compare with other alternatives for certified robustness.
| Field | Value |
|---|---|
| Year | 2020 |
| Venue | Advances in Neural Information Processing Systems (NeurIPS 2020) |
| DocType | Conference |
| Volume | 33 |
| ISSN | 1049-5258 |
| Citations | 0 |
| PageRank | 0.34 |
| References | 0 |
| Authors | 3 |
| Name | Order | Citations | PageRank |
|---|---|---|---|
| Jeremias Sulam | 1 | 1 | 1.72 |
| Ramchandran Muthukumar | 2 | 0 | 0.34 |
| R. Arora | 3 | 489 | 35.97 |