Abstract |
---|
Automatic Answer span Extraction (AE) focuses on identifying key information in paragraphs that questions can be asked about. It has been used to facilitate downstream question generation and data augmentation for question answering. Current work on AE relies heavily on the annotated answer spans from Machine Reading Comprehension (MRC) datasets. However, these methods suffer from the partial annotation problem caused by the annotation protocols of MRC tasks. To tackle this problem, we propose SCOPE, a Structured Context graph network with Positive-unlabeled learning. SCOPE first represents the paragraph by constructing a graph with both syntactic and semantic edges, then adopts a unified pointer network for answer span identification. SCOPE narrows the discrepancy between AE and MRC by formulating AE as a Positive-Unlabeled (PU) learning problem, thus recovering more answer spans from paragraphs. To evaluate newly extracted spans without annotation, we also present an automatic metric from the perspective of question answering and text summarization, which correlates well with human judgments. Comprehensive experiments on both AE and downstream tasks demonstrate the effectiveness of our proposed framework. Our code is available at https://github.com/iambabao/SCOPE. |
Year | DOI | Venue |
---|---|---|
2022 | 10.1145/3488560.3498399 | WSDM |
Keywords | DocType | Citations |
---|---|---|
Information extraction, Positive-unlabeled learning | Conference | 0 |
PageRank | References | Authors |
---|---|---|
0.34 | 0 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Qiaoben Bao | 1 | 0 | 0.68 |
Jiangjie Chen | 2 | 0 | 1.35 |
Linfang Liu | 3 | 0 | 0.34 |
Jingping Liu | 4 | 3 | 3.43 |
Jiaqing Liang | 5 | 37 | 9.59 |
Yanghua Xiao | 6 | 482 | 54.90 |