Title
Investigating Biases in Textual Entailment Datasets
Abstract
Understanding logical relationships between sentences is an important task in natural language understanding. To drive progress on this task, researchers have collected datasets for training machine learning systems and for evaluating current models. However, as in the crowdsourced Visual Question Answering (VQA) task, some biases inevitably occur in the data. In our experiments, we find that a classifier trained on only the hypotheses of the SNLI dataset achieves an accuracy of 64%. We analyze the extent of this bias in the SNLI and MultiNLI datasets, discuss its implications, and propose a simple method to reduce the biases in the datasets.
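The abstract's key probe is a hypothesis-only baseline: a classifier that never sees the premise, so any accuracy above chance must come from annotation artifacts in the hypotheses. A minimal sketch of that idea, using a toy stand-in dataset and a stdlib-only Naive Bayes model (this is an illustration, not the paper's code or data):

```python
# Hypothesis-only NLI baseline (illustrative sketch, not the paper's code).
# The classifier is trained on hypotheses alone; above-chance accuracy
# signals dataset bias. The toy examples below are invented for the demo.
from collections import Counter, defaultdict
import math

# (premise, hypothesis, label) triples; the premise is deliberately unused.
train = [
    ("A man plays guitar.", "Nobody is playing music.", "contradiction"),
    ("A dog runs outside.", "No animal is outside.", "contradiction"),
    ("A woman reads.", "A person is reading.", "entailment"),
    ("Kids play soccer.", "Some children are playing.", "entailment"),
    ("A man walks.", "A man walks to work.", "neutral"),
    ("A girl smiles.", "A girl smiles at her friend.", "neutral"),
]

def tokenize(s):
    return s.lower().strip(".").split()

# Word-level Naive Bayes fit on hypotheses only.
label_counts = Counter()
word_counts = defaultdict(Counter)
vocab = set()
for _premise, hypothesis, label in train:
    label_counts[label] += 1
    for w in tokenize(hypothesis):
        word_counts[label][w] += 1
        vocab.add(w)

def predict(hypothesis):
    best, best_lp = None, -math.inf
    for label in label_counts:
        lp = math.log(label_counts[label] / sum(label_counts.values()))
        total = sum(word_counts[label].values()) + len(vocab)
        for w in tokenize(hypothesis):
            # Laplace smoothing so unseen words don't zero the likelihood.
            lp += math.log((word_counts[label][w] + 1) / total)
        if lp > best_lp:
            best, best_lp = label, lp
    return best

# Negation words like "nobody"/"no" are a well-known artifact cue for
# the contradiction class, which this probe picks up.
print(predict("Nobody is outside."))  # → contradiction
```

The same probe scaled to all of SNLI's hypotheses is what yields the 64% figure reported above (chance on SNLI's three balanced classes is roughly 33%).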
Year
2019
Venue
CoRR
DocType
Journal
Volume
abs/1906.09635
Citations
0
PageRank
0.34
References
0
Authors
4
Name                Order  Citations  PageRank
Shawn Tan           1      0          2.37
Yikang Shen         2      35         6.62
Chin-Wei Huang      3      8          5.18
Aaron C. Courville  4      6671       348.46