Title
Probing Prior Knowledge Needed in Challenging Chinese Machine Reading Comprehension
Abstract
With an ultimate goal of narrowing the gap between human and machine readers in text comprehension, we present the first collection of Challenging Chinese machine reading Comprehension datasets (C^3) collected from language and professional certification exams, which contains 13,924 documents and their associated 23,990 multiple-choice questions. Most of the questions in C^3 cannot be answered merely by surface-form matching against the given text. As a pilot study, we closely analyze the prior knowledge (i.e., linguistic, domain-specific, and general world knowledge) needed in these real-world reading comprehension tasks. We further explore how to leverage linguistic knowledge including a lexicon of idioms and proverbs, graphs of general world knowledge (e.g., ConceptNet), and domain-specific knowledge such as textbooks to aid machine readers, through fine-tuning a pre-trained language model. Experimental results demonstrate that linguistic and general world knowledge may help improve the performance of the baseline reader in both general and domain-specific tasks. C^3 will be available at this http URL.
Year: 2019
Venue: arXiv: Computation and Language
DocType: Journal
Volume: abs/1904.09679
Citations: 0
PageRank: 0.34
References: 0
Authors: 4
Name            Order   Citations   PageRank
Kai Sun         1       33          7.71
Dian Yu         2       64          11.49
Dong Yu         3       6264        475.73
Claire Cardie   4       5591        555.20