Title
Evaluations and Methods for Explanation through Robustness Analysis
Abstract
Feature-based explanations, which provide the importance of each feature towards the model prediction, are arguably one of the most intuitive ways to explain a model. In this paper, we establish a novel set of evaluation criteria for such feature-based explanations through robustness analysis. In contrast to existing evaluations, which require specifying some way to "remove" features and thus inevitably introduce biases and artifacts, we make use of the subtler notion of smaller adversarial perturbations. By optimizing towards our proposed evaluation criteria, we obtain new explanations that are loosely necessary and sufficient for a prediction. We further extend the explanation to extract the set of features that would move the current prediction to a target class by adopting targeted adversarial attacks for the robustness analysis. Through experiments across multiple domains and a human study, we validate the usefulness of our evaluation criteria and our derived explanations.
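The criterion described in the abstract can be illustrated with a small sketch: anchor the features an explanation marks as important, then measure how large an adversarial perturbation on the remaining features must be to change the prediction; a better explanation makes the prediction harder to flip. The snippet below is a minimal illustration under the simplifying assumption of a linear binary classifier, where the restricted minimal perturbation has a closed form; the function names, the gradient-magnitude saliency, and the toy data are our own assumptions, not the paper's actual attack-based evaluation on general models.

```python
import numpy as np

def min_flip_norm(w, b, x, perturbable):
    """Smallest L2 perturbation, restricted to the `perturbable` coordinates,
    that flips the sign of the linear score w.x + b.
    Closed form for a linear model: |score| / ||w restricted to perturbable coords||."""
    score = float(np.dot(w, x) + b)
    w_sub = w * perturbable               # zero out coordinates that must stay fixed
    norm = np.linalg.norm(w_sub)
    return np.inf if norm == 0 else abs(score) / norm

def robustness_curve(w, b, x, importance, ks):
    """For each k, hold the top-k most 'important' features fixed and measure how
    hard it is to flip the prediction by perturbing only the remaining features.
    Larger values indicate the explanation captured the truly relevant features."""
    order = np.argsort(-importance)       # features ranked by claimed importance
    curve = []
    for k in ks:
        perturbable = np.ones_like(x)
        perturbable[order[:k]] = 0.0      # anchored (explained) features are frozen
        curve.append(min_flip_norm(w, b, x, perturbable))
    return curve

# Toy usage: gradient-magnitude importance on a random linear classifier.
rng = np.random.default_rng(0)
w, b = rng.normal(size=10), 0.1
x = rng.normal(size=10)
importance = np.abs(w * x)                # a simple saliency proxy
print(robustness_curve(w, b, x, importance, ks=[1, 3, 5]))
```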
Year
2021
Venue
ICLR
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
7
Name                  Order  Citations  PageRank
Cheng-Yu Hsieh        1      1          2.05
Chih-Kuan Yeh         2      19         3.34
Xuanqing Liu          3      78         7.99
Pradeep D. Ravikumar  4      2185       155.99
Kim Seungyeon         5      0          0.34
Sanjiv Kumar          6      2182       153.05
Cho-Jui Hsieh         7      5034       291.05