Title
Interpreting Predictions of NLP Models
Abstract
Although neural NLP models are highly expressive and empirically successful, they also systematically fail in counterintuitive ways and are opaque in their decision-making process. This tutorial will provide a background on interpretation techniques, i.e., methods for explaining the predictions of NLP models. We will first situate example-specific interpretations in the context of other ways to understand models (e.g., probing, dataset analyses). Next, we will present a thorough study of example-specific interpretations, including saliency maps, input perturbations (e.g., LIME, input reduction), adversarial attacks, and influence functions. Alongside these descriptions, we will walk through source code that creates and visualizes interpretations for a diverse set of NLP tasks. Finally, we will discuss open problems in the field, e.g., evaluating, extending, and improving interpretation methods.
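As a concrete illustration of one of the methods the abstract lists, the sketch below computes a gradient-based saliency map: it backpropagates the predicted-class score to the input embeddings and scores each token by the gradient's L2 norm. This is a minimal sketch written for this record, not the tutorial's own walkthrough code; the off-the-shelf HuggingFace sentiment checkpoint and the L2-norm aggregation are assumptions for demonstration.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumed example checkpoint: an off-the-shelf SST-2 sentiment classifier.
MODEL = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)
model.eval()

text = "a gripping and surprisingly funny film"
enc = tokenizer(text, return_tensors="pt")

# Embed the tokens manually so gradients can be taken w.r.t. the embeddings.
embeds = model.get_input_embeddings()(enc["input_ids"])
embeds.retain_grad()

logits = model(inputs_embeds=embeds, attention_mask=enc["attention_mask"]).logits
pred = logits.argmax(dim=-1).item()

# Backpropagate the predicted-class score to the input embeddings.
logits[0, pred].backward()

# One saliency score per token: L2 norm of its gradient, normalized to sum to 1.
scores = embeds.grad[0].norm(dim=-1)
scores = scores / scores.sum()

for token, score in zip(tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist()), scores):
    print(f"{token:>12}  {score:.3f}")
```

Other techniques named in the abstract (LIME, input reduction, adversarial attacks, influence functions) follow the same spirit of attributing a prediction to parts of the input or training data, but require perturbation or retraining machinery beyond this sketch.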
Year
2020
DOI
10.18653/V1/2020.EMNLP-TUTORIALS.3
Venue
EMNLP
DocType
Conference
Volume
2020.emnlp-tutorials
Citations
0
PageRank
0.34
References
0
Authors
3
Name             Order  Citations  PageRank
Eric Wallace     1      18         7.45
Matthew Gardner  2      704        38.49
Sameer Singh     3      1060       71.63