Title | ||
---|---|---|
Evaluating language understanding accuracy with respect to objective outcomes in a dialogue system |
Abstract | ||
---|---|---|
It is not always clear how the differences in intrinsic evaluation metrics for a parser or classifier will affect the performance of the system that uses it. We investigate the relationship between the intrinsic evaluation scores of an interpretation component in a tutorial dialogue system and the learning outcomes in an experiment with human users. Following the PARADISE methodology, we use multiple linear regression to build predictive models of learning gain, an important objective outcome metric in tutorial dialogue. We show that standard intrinsic metrics such as F-score alone do not predict the outcomes well. However, we can build predictive performance functions that account for up to 50% of the variance in learning gain by combining features based on standard evaluation scores and on the confusion matrix entries. We argue that building such predictive models can help us better evaluate performance of NLP components that cannot be distinguished based on F-score alone, and illustrate our approach by comparing the current interpretation component in the system to a new classifier trained on the evaluation data. |
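
The regression step described in the abstract can be illustrated with a minimal sketch. It fits a multiple linear regression of learning gain on two per-user features and reports the share of variance explained (R²). Everything specific here is an assumption made for illustration, not taken from the paper: the feature names (`f_score`, `rejected_correct_rate`), the synthetic data, and the coefficients used to generate it. The paper's actual features and model are not given in this record.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations apply to both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: features are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear_regression(
    features: Sequence[Sequence[float]], targets: Sequence[float]
) -> List[float]:
    """Ordinary least squares via the normal equations.

    Returns [intercept, w1, w2, ...].
    """
    rows = [[1.0] + list(f) for f in features]  # prepend the intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs: Sequence[float], feature_row: Sequence[float]) -> float:
    return coefs[0] + sum(w * x for w, x in zip(coefs[1:], feature_row))


def r_squared(
    coefs: Sequence[float],
    features: Sequence[Sequence[float]],
    targets: Sequence[float],
) -> float:
    """Proportion of variance in the targets explained by the fitted model."""
    mean_y = sum(targets) / len(targets)
    ss_tot = sum((y - mean_y) ** 2 for y in targets)
    ss_res = sum((y - predict(coefs, f)) ** 2 for f, y in zip(features, targets))
    return 1.0 - ss_res / ss_tot


# Synthetic, hypothetical data: one row per user, with columns
# (f_score, rejected_correct_rate). The second column stands in for a
# confusion-matrix-based feature. None of these numbers come from the paper.
features = [
    (0.70, 0.10), (0.80, 0.05), (0.65, 0.20),
    (0.90, 0.02), (0.75, 0.15), (0.85, 0.08),
]
# Targets are generated from a known linear rule, so the fit is exact here.
targets = [0.2 + 0.5 * f - 0.8 * r for f, r in features]

coefs = fit_linear_regression(features, targets)
print("coefficients:", [round(c, 4) for c in coefs])
print("R^2:", round(r_squared(coefs, features, targets), 4))
```

Because the synthetic targets are noise-free, this fit recovers the generating coefficients and R² is close to 1. With real experimental data the interesting quantity is how far R² falls short of that, which is the sense in which the abstract reports models accounting for up to 50% of the variance in learning gain.
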
Year | Venue | Keywords |
---|---|---|
2012 | EACL | intrinsic evaluation score,language understanding accuracy,predictive model,tutorial dialogue system,nlp component,evaluation data,predictive performance function,intrinsic evaluation metrics,objective outcome,standard evaluation score,current interpretation component,standard intrinsic metrics |
Field | DocType | Citations |
---|---|---|
Confusion matrix,Learning gain,Computer science,Natural language processing,Artificial intelligence,Parsing,Classifier (linguistics),Machine learning,Language understanding,Instrumental and intrinsic value,Linear regression | Conference | 3 |
PageRank | References | Authors |
---|---|---|
0.44 | 13 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Myroslava O. Dzikovska | 1 | 360 | 35.49 |
Peter Bell | 2 | 192 | 22.97 |
Amy Isard | 3 | 335 | 63.31 |
Johanna D. Moore | 4 | 2152 | 443.80 |