Abstract | ||
---|---|---|
Methods of neural network attribution have emerged out of a necessity for explanation and accountability in the predictions of black-box neural models. Most approaches use a variation of sensitivity analysis, where individual input variables are perturbed and the downstream effects on some output metric are measured. We demonstrate that a number of critical functional properties are not revealed when only considering lower-order perturbations. Motivated by these shortcomings, we propose a general framework for decomposing the orders of influence that a collection of input variables has on an output classification. These orders are based on the cardinality of input subsets which are perturbed to yield a change in classification. This decomposition can be naturally applied to attribute which input variables rely on higher-order coordination to impact the classification decision. We demonstrate that our approach correctly identifies higher-order attribution on a number of synthetic examples. Additionally, we showcase the differences between attribution in our approach and existing approaches on benchmark networks for MNIST and ImageNet. |
Year | Venue | DocType |
---|---|---|
2021 | 24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS) | Conference |
Volume | ISSN | Citations |
130 | 2640-3498 | 0 |
PageRank | References | Authors |
0.34 | 0 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Kyle Reing | 1 | 5 | 1.91 |
Greg Ver Steeg | 2 | 243 | 32.99 |
Aram Galstyan | 3 | 1033 | 94.05 |