Adversarial Examples for Evaluating Reading Comprehension Systems. - Citegraph

Paper Info

Title
Adversarial Examples for Evaluating Reading Comprehension Systems.

Abstract
Standard accuracy metrics indicate that reading comprehension systems are making rapid progress, but the extent to which these systems truly understand language remains unclear. To reward systems with real language understanding abilities, we propose an adversarial evaluation scheme for the Stanford Question Answering Dataset (SQuAD). Our method tests whether systems can answer questions about paragraphs that contain adversarially inserted sentences, which are automatically generated to distract computer systems without changing the correct answer or misleading humans. In this adversarial setting, the accuracy of sixteen published models drops from an average of 75% F1 score to 36%; when the adversary is allowed to add ungrammatical sequences of words, average accuracy on four models decreases further to 7%. We hope our insights will motivate the development of new models that understand language more precisely.
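
The evaluation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' code: `token_f1` is a simplified token-overlap F1 (it lowercases and splits on whitespace only), `shallow_model` is a hypothetical toy model that ignores the question, and the paragraph, question, and distractor sentence are made up for illustration. The sketch assumes the distracting sentence is appended to the end of the paragraph.

```python
from collections import Counter
from typing import Callable, List, Tuple

# (paragraph, question, gold answer); a hypothetical layout for this sketch.
Example = Tuple[str, str, str]
Model = Callable[[str, str], str]


def token_f1(prediction: str, gold: str) -> float:
    """Simplified token-overlap F1 between predicted and gold answers."""
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; exactly one empty is a miss.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def average_f1(model: Model, examples: List[Example], distractor: str = "") -> float:
    """Average F1 over examples, optionally appending a distractor sentence."""
    total = 0.0
    for paragraph, question, gold in examples:
        if distractor:
            paragraph = paragraph + " " + distractor
        total += token_f1(model(paragraph, question), gold)
    return total / len(examples)


def shallow_model(paragraph: str, question: str) -> str:
    """Toy model: returns the word after the last 'to' in the paragraph.

    It deliberately ignores the question, standing in for a system that
    relies on surface patterns rather than understanding.
    """
    tokens = paragraph.replace(".", " ").split()
    answer = ""
    for i, tok in enumerate(tokens[:-1]):
        if tok.lower() == "to":
            answer = tokens[i + 1]
    return answer


examples = [("Tesla moved to Prague in 1880.",
             "Where did Tesla move in 1880?",
             "Prague")]
# Made-up distractor: similar surface form, does not change the correct answer.
distractor = "Tadakatsu moved to Chicago in 1881."

clean_f1 = average_f1(shallow_model, examples)
adversarial_f1 = average_f1(shallow_model, examples, distractor)
print(f"clean F1: {clean_f1:.2f}, adversarial F1: {adversarial_f1:.2f}")
```

A human reader still answers "Prague" after the extra sentence is added, while the toy model switches to the distractor's city, which is the kind of gap the adversarial setting is meant to expose.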

Field	Value
Year	2017
DOI	10.18653/v1/D17-1215
Venue	EMNLP
DocType	Conference
Volume	abs/1707.07328
Citations	114
PageRank	2.87
References	16
Authors	2

Authors (2 rows)

Name	Order	Citations	PageRank
Robin Jia	1	227	12.53
Percy Liang	2	3416	172.27
