Title
Hierarchical interpretations for neural network predictions.
Abstract
Deep neural networks (DNNs) have achieved impressive predictive performance due to their ability to learn complex, non-linear relationships between variables. However, the inability to effectively visualize these relationships has led to DNNs being characterized as black boxes, and has consequently limited their applications. To ameliorate this problem, we introduce the use of hierarchical interpretations to explain DNN predictions through our proposed method, agglomerative contextual decomposition (ACD). Given a prediction from a trained DNN, ACD produces a hierarchical clustering of the input features, along with the contribution of each cluster to the final prediction. This hierarchy is optimized to identify clusters of features that the DNN learned are predictive. Using examples from the Stanford Sentiment Treebank and ImageNet, we show that ACD is effective at diagnosing incorrect predictions and identifying dataset bias. Through human experiments, we demonstrate that ACD enables users both to identify the more accurate of two DNNs and to better trust a DNN's outputs. We also find that ACD's hierarchy is largely robust to adversarial perturbations, implying that it captures fundamental aspects of the input and ignores spurious noise.
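The abstract describes ACD as greedily building a hierarchy of feature clusters, each annotated with its contribution to the prediction. A minimal sketch of that agglomeration loop is below; it is not the authors' implementation, and the `score` function here is a hypothetical stand-in for contextual decomposition, which in the paper computes a group's contribution by propagating it through the trained DNN.

```python
# Hypothetical sketch of ACD-style agglomeration (not the authors' code).
# Greedily merges adjacent feature groups, at each step joining the pair
# whose merged group receives the highest importance score, and records
# the resulting hierarchy along with each cluster's score.

def agglomerate(features, score):
    """features: list of feature names (e.g. words of a sentence);
    score: maps a tuple of features to a scalar contribution
    (a stand-in for contextual decomposition on a trained DNN)."""
    groups = [(f,) for f in features]
    hierarchy = [(g, score(g)) for g in groups]
    while len(groups) > 1:
        # pick the adjacent pair whose union scores highest
        best = max(range(len(groups) - 1),
                   key=lambda i: score(groups[i] + groups[i + 1]))
        merged = groups[best] + groups[best + 1]
        groups[best:best + 2] = [merged]
        hierarchy.append((merged, score(merged)))
    return hierarchy

# Toy usage with an additive per-word sentiment score (illustrative only;
# a real score would capture interactions learned by the DNN).
weights = {"not": -1.0, "very": 0.5, "good": 2.0}
hierarchy = agglomerate(["not", "very", "good"],
                        lambda g: sum(weights[w] for w in g))
```

In this toy run, "very" and "good" are merged first (their union scores highest), and the root cluster covering the full phrase is added last, mirroring how ACD surfaces the most predictive feature groups at each level of the hierarchy.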
Year
2018
Venue
international conference on learning representations
Field
Hierarchical clustering, Cluster (physics), Artificial intelligence, Treebank, Black box, Artificial neural network, Hierarchy, Spurious relationship, Machine learning, Deep neural networks, Mathematics
Volume
abs/1806.05337
Journal
ICLR 2019
Citations
2
PageRank
0.36
References
29
Authors
3
Name | Order | Citations | PageRank
Chandan Singh | 1 | 26 | 4.57
W. James Murdoch | 2 | 32 | 2.61
Bin Yu | 3 | 19842 | 41.03