Interpreting Deep Neural Networks through Prototype Factorization - Citegraph

Paper Info

Title
Interpreting Deep Neural Networks through Prototype Factorization

Abstract
Typical deep neural networks (DNNs) are complex black-box models, and their decision-making process can be difficult to comprehend even for experienced machine learning practitioners. Therefore, their use could be limited in mission-critical scenarios despite state-of-the-art performance on many challenging ML tasks. Through this work, we empower users to interpret DNNs with a post-hoc analysis protocol. We propose ProtoFac, an explainable matrix factorization technique that decomposes the latent representations at any selected layer in a pre-trained DNN as a collection of weighted prototypes, which are a small number of exemplars extracted from the original data (e.g., image patches, shapelets). Using the factorized weights and prototypes, we build a surrogate model for interpretation by replacing the corresponding layer in the neural network. We identify a number of desired properties of ProtoFac, including authenticity, interpretability, and simplicity, and propose the optimization objective and training procedure accordingly. The method is model-agnostic and can be applied to DNNs with varying architectures. It goes beyond per-sample feature-based explanation by providing prototypes as a condensed set of evidence used by the model for decision making. We applied ProtoFac to interpret pre-trained DNNs for a variety of ML tasks, including time series classification on electrocardiograms and image classification. The results show that ProtoFac is able to extract meaningful prototypes that explain the models' decisions while truthfully reflecting the models' operation. We also evaluated human interpretability through Amazon Mechanical Turk (MTurk), showing that ProtoFac is able to produce interpretable and user-friendly explanations.
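
As a rough illustration of the factorization idea in the abstract, the sketch below approximates a matrix of latent representations as weights times prototype vectors. It is not ProtoFac's objective or training procedure, which this record does not give: it swaps in plain unconstrained least squares, uses random vectors as stand-ins for the latent encodings of exemplars, and all names (`factorize_with_prototypes`, `latents`, `prototype_latents`) are hypothetical.

```python
import numpy as np


def factorize_with_prototypes(latents, prototype_latents):
    """Return weights W minimizing ||latents - W @ prototype_latents||^2.

    latents:           (n_samples, d) activations from one selected layer
    prototype_latents: (k, d) latent vectors of k prototypes (assumed given)
    """
    # lstsq solves A x = B; here A = P^T (d, k), B = H^T (d, n), x = W^T (k, n).
    weights_t, *_ = np.linalg.lstsq(prototype_latents.T, latents.T, rcond=None)
    return weights_t.T


# Synthetic stand-in data (assumption): latents built exactly from 3 prototypes.
rng = np.random.default_rng(0)
prototype_latents = rng.normal(size=(3, 8))   # k=3 prototypes, d=8 latent dims
true_weights = rng.uniform(size=(5, 3))       # n=5 samples
latents = true_weights @ prototype_latents

weights = factorize_with_prototypes(latents, prototype_latents)

# A surrogate model would feed this reconstruction to the remaining layers
# in place of the original layer output.
reconstruction = weights @ prototype_latents
```

In this toy setting the reconstruction matches the original latents because they were generated from the prototypes; with real activations the fit would be approximate, and the method described above additionally has to select the prototypes from the data.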

Year	2020
DOI	10.1109/ICDMW51313.2020.00068
Venue	2020 International Conference on Data Mining Workshops (ICDMW)
Keywords	Matrix Factorization, Explainable AI, Deep Neural Networks
DocType	Conference
ISSN	2375-9232
ISBN	978-1-7281-9013-6
Citations	0
PageRank	0.34
References	0
Authors	5

Authors (5 rows)

Name	Order	Citations	PageRank
Subhajit Das	1	13	6.22
Panpan Xu	2	227	11.73
Zeng Dai	3	1	1.36
Alex Endert	4	974	52.18
Liu Ren	5	141	10.51
