Title
Interpreting Deep Neural Networks through Prototype Factorization
Abstract
Typical deep neural networks (DNNs) are complex black-box models, and their decision-making process can be difficult to comprehend even for experienced machine learning practitioners. Their use can therefore be limited in mission-critical scenarios despite state-of-the-art performance on many challenging ML tasks. In this work, we empower users to interpret DNNs with a post-hoc analysis protocol. We propose ProtoFac, an explainable matrix factorization technique that decomposes the latent representations at any selected layer of a pre-trained DNN into a collection of weighted prototypes, which are a small number of exemplars extracted from the original data (e.g., image patches, shapelets). Using the factorized weights and prototypes, we build a surrogate model for interpretation by replacing the corresponding layer in the neural network. We identify a number of desired properties of ProtoFac, including authenticity, interpretability, and simplicity, and propose the optimization objective and training procedure accordingly. The method is model-agnostic and can be applied to DNNs with varying architectures. It goes beyond per-sample feature-based explanation by providing prototypes as a condensed set of evidence used by the model for decision making. We applied ProtoFac to interpret pre-trained DNNs for a variety of ML tasks, including time series classification on electrocardiograms and image classification. The results show that ProtoFac is able to extract meaningful prototypes that explain the models' decisions while truthfully reflecting the models' operation. We also evaluated human interpretability through Amazon Mechanical Turk (MTurk), showing that ProtoFac is able to produce interpretable and user-friendly explanations.
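The core idea of the abstract, approximating a layer's latent representations as non-negative weighted combinations of a few exemplars taken from the data, can be illustrated with a minimal NumPy sketch. This is not the authors' ProtoFac objective or training procedure: the random prototype selection, the projected gradient solver, and the names `fit_weights` and `prototype_factorize` are all assumptions made for illustration, and the activations `H` are synthetic.

```python
import numpy as np


def fit_weights(H, P, steps=500):
    """Non-negative W minimizing 0.5 * ||H - W @ P||_F^2 (projected gradient descent)."""
    n, k = H.shape[0], P.shape[0]
    W = np.full((n, k), 1.0 / k)          # start from uniform weights
    G = P @ P.T                           # (k, k) Gram matrix of the prototypes
    HPt = H @ P.T                         # (n, k) cross term, constant across steps
    lr = 1.0 / (np.linalg.norm(G, 2) + 1e-12)  # step size 1 / largest eigenvalue of G
    for _ in range(steps):
        grad = W @ G - HPt                # gradient of the objective w.r.t. W
        W = np.maximum(W - lr * grad, 0.0)  # gradient step, then project onto W >= 0
    return W


def prototype_factorize(H, k, seed=0):
    """Pick k rows of H as prototypes (assumption: at random) and fit weights."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(H.shape[0], size=k, replace=False)
    P = H[idx]                            # prototypes are actual data exemplars
    W = fit_weights(H, P)
    return idx, W, P


# Synthetic stand-in for latent activations at one layer: 40 samples, 8 features.
rng = np.random.default_rng(0)
H = np.abs(rng.normal(size=(40, 8)))

idx, W, P = prototype_factorize(H, k=5)
H_hat = W @ P                             # surrogate activations fed to later layers
rel_err = np.linalg.norm(H - H_hat) / np.linalg.norm(H)
print(f"prototype indices: {idx}, relative reconstruction error: {rel_err:.3f}")
```

In a surrogate model along the lines the abstract describes, `H_hat` would stand in for the selected layer's output, and each row of `W` would indicate how strongly each prototype contributes to a given sample's representation.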
Year
2020
DOI
10.1109/ICDMW51313.2020.00068
Venue
2020 International Conference on Data Mining Workshops (ICDMW)
Keywords
Matrix Factorization, Explainable AI, Deep Neural Networks
DocType
Conference
ISSN
2375-9232
ISBN
978-1-7281-9013-6
Citations
0
PageRank
0.34
References
0
Authors
5
Name           Order  Citations  PageRank
Subhajit Das   1      13         6.22
Panpan Xu      2      227        11.73
Zeng Dai       3      1          1.36
Alex Endert    4      974        52.18
Liu Ren        5      141        10.51