Abstract | ||
---|---|---|
Typical deep neural networks (DNNs) are complex black-box models, and their decision-making process can be difficult to comprehend even for experienced machine learning practitioners. Their use can therefore be limited in mission-critical scenarios despite state-of-the-art performance on many challenging ML tasks. In this work, we empower users to interpret DNNs with a post-hoc analysis protocol. We propose ProtoFac, an explainable matrix factorization technique that decomposes the latent representations at any selected layer of a pre-trained DNN into a collection of weighted prototypes, which are a small number of exemplars extracted from the original data (e.g., image patches, shapelets). Using the factorized weights and prototypes, we build a surrogate model for interpretation by replacing the corresponding layer in the neural network. We identify a number of desired properties of ProtoFac, including authenticity, interpretability, and simplicity, and propose the optimization objective and training procedure accordingly. The method is model-agnostic and can be applied to DNNs with varying architectures. It goes beyond per-sample feature-based explanation by providing prototypes as a condensed set of evidence used by the model for decision making. We applied ProtoFac to interpret pre-trained DNNs for a variety of ML tasks, including time series classification on electrocardiograms and image classification. The results show that ProtoFac is able to extract meaningful prototypes to explain the models' decisions while truthfully reflecting the models' operation. We also evaluated human interpretability through Amazon Mechanical Turk (MTurk), showing that ProtoFac is able to produce interpretable and user-friendly explanations. |
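
The decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not give ProtoFac's optimization objective or training procedure, so plain unconstrained least squares is swapped in here. The sketch assumes the latent representations at the chosen layer are stacked in a matrix `latents` (one row per sample) and that the prototypes' own latent vectors are already known as rows of `prototype_latents`. All names and shapes are hypothetical.

```python
import numpy as np

def factorize_with_prototypes(latents, prototype_latents):
    """Return weights W minimizing ||latents - W @ prototype_latents||_F.

    Assumption: ordinary least squares stands in for the paper's
    objective, which is not specified in the abstract.
    """
    # lstsq solves prototype_latents.T @ X = latents.T, where X = W.T
    weights_t, *_ = np.linalg.lstsq(prototype_latents.T, latents.T, rcond=None)
    return weights_t.T

# Synthetic data: 5 samples, 3 prototypes, 8-dimensional latent space.
rng = np.random.default_rng(0)
prototype_latents = rng.normal(size=(3, 8))
true_weights = rng.random(size=(5, 3))
latents = true_weights @ prototype_latents

weights = factorize_with_prototypes(latents, prototype_latents)
# The surrogate representation that would replace the layer's output.
reconstruction = weights @ prototype_latents
```

Each row of `weights` says how strongly each prototype contributes to one sample's latent representation, which is the quantity a prototype-based explanation would present to a user.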
Year | DOI | Venue |
---|---|---|
2020 | 10.1109/ICDMW51313.2020.00068 | 2020 International Conference on Data Mining Workshops (ICDMW) |
Keywords | DocType | ISSN |
---|---|---|
Matrix Factorization, Explainable AI, Deep Neural Networks | Conference | 2375-9232 |
ISBN | Citations | PageRank |
---|---|---|
978-1-7281-9013-6 | 0 | 0.34 |
References | Authors |
---|---|
0 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Subhajit Das | 1 | 13 | 6.22 |
Panpan Xu | 2 | 227 | 11.73 |
Zeng Dai | 3 | 1 | 1.36 |
Alex Endert | 4 | 974 | 52.18 |
Liu Ren | 5 | 141 | 10.51 |