Harvesting More Answer Spans from Paragraph beyond Annotation - Citegraph

Paper Info

Title
Harvesting More Answer Spans from Paragraph beyond Annotation

Abstract
Automatic Answer span Extraction (AE) focuses on identifying key information from paragraphs that can be asked. It has been used to facilitate downstream question generation tasks or data augmentation for question answering. Current work of AE heavily relies on the annotated answer spans from Machine Reading Comprehension (MRC) datasets. However, these methods suffer from the partial annotation problem due to the annotation protocols of MRC tasks. To tackle this problem, we propose SCOPE, a Structured Context graph network with Positive-unlabeled learning. SCOPE first represents the paragraph by constructing a graph with both syntactic and semantic edges, then adopts a unified pointer network for answer span identification. SCOPE narrows the discrepancy between AE and MRC by formulating AE as a Positive-unlabeled (PU) learning problem, thus recovering more answer spans from paragraphs. To evaluate newly extracted spans without annotation, we also present an automatic metric from the perspective of question answering and text summarization, which correlates well with human judgments. Comprehensive experiments on both AE and downstream tasks demonstrate the effectiveness of our proposed framework. Our code is available at https://github.com/iambabao/SCOPE.

Year	DOI	Venue
2022	10.1145/3488560.3498399	WSDM
Keywords	DocType	Citations
Information extraction, Positive-unlabeled learning	Conference	0
PageRank	References	Authors
0.34	0	6

Authors (6 rows)

Name	Order	Citations	PageRank
Qiaoben Bao	1	0	0.68
Jiangjie Chen	2	0	1.35
Linfang Liu	3	0	0.34
Jingping Liu	4	3	3.43
Jiaqing Liang	5	37	9.59
Yanghua Xiao	6	482	54.90
