Title
Gazetteer-Guided Keyphrase Generation from Research Papers
Abstract
Keyphrase generation aims to produce the key phrases that capture the primary content of a document. An external domain-specific gazetteer can assist in generating keyphrases that are literally absent from the document (i.e., do not match any contiguous sub-sequence of the source text) but relevant to its content. In this paper, we present a technique for integrating knowledge from a gazetteer to improve keyphrase generation from research papers. We also present a copy mechanism that helps our model use the gazetteer vocabulary to handle out-of-vocabulary words in keyphrases. Since constructing and maintaining a relevant, high-quality gazetteer by hand is very expensive, we also propose a method for automatically constructing a gazetteer for a given input document by leveraging similar documents in the training corpus. The gazetteer constructed in this way helps the model focus on corpus-level information carried by other similar documents. Although this external information is crucial, it has not been considered in previous studies. Experiments on real-world datasets of research papers demonstrate that our proposed approach improves the performance of state-of-the-art keyphrase generation models.
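Two ideas in the abstract can be made concrete with a small sketch: the definition of an absent keyphrase (no match with any contiguous sub-sequence of the source text) and the construction of a per-document gazetteer from similar training documents. The abstract does not say how similarity is measured or how many neighbours are used, so the bag-of-words cosine similarity, the parameter `k`, whitespace tokenization, and all function names below are assumptions made for illustration, not the authors' method. How the gazetteer is fed to the encoder-decoder and the copy mechanism is not covered here.

```python
from collections import Counter
import math


def tokenize(text):
    """Lowercase whitespace tokenization (a deliberate simplification)."""
    return text.lower().split()


def is_absent(keyphrase, document):
    """True if the keyphrase matches no contiguous token sub-sequence of the document."""
    kp, doc = tokenize(keyphrase), tokenize(document)
    n = len(kp)
    return not any(doc[i:i + n] == kp for i in range(len(doc) - n + 1))


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters (assumed similarity measure)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_gazetteer(document, training_corpus, k=2):
    """Collect the keyphrases of the k training documents most similar to `document`.

    training_corpus is a list of (text, keyphrases) pairs; k is a hypothetical parameter.
    """
    query = Counter(tokenize(document))
    ranked = sorted(
        training_corpus,
        key=lambda pair: cosine(query, Counter(tokenize(pair[0]))),
        reverse=True,
    )
    gazetteer = []
    for _, keyphrases in ranked[:k]:
        for kp in keyphrases:
            if kp not in gazetteer:  # keep first occurrence, preserve ranking order
                gazetteer.append(kp)
    return gazetteer


# Toy data, invented purely for illustration.
corpus = [
    ("neural keyphrase generation with an encoder decoder model",
     ["keyphrase generation", "encoder-decoder"]),
    ("graph algorithms for shortest path routing",
     ["shortest path", "graph theory"]),
    ("attention based sequence to sequence model for summarization",
     ["attention", "text summarization"]),
]
doc = "an encoder decoder model with attention for generation of keyphrases"

gazetteer = build_gazetteer(doc, corpus, k=2)
absent_candidates = [kp for kp in gazetteer if is_absent(kp, doc)]
```

In this toy run the gazetteer contributes candidates such as "keyphrase generation", which never appears as a contiguous span in `doc` yet is relevant to it. That is the kind of corpus-level signal the abstract attributes to the gazetteer.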
Year
2021
DOI
10.1007/978-3-030-75762-5_52
Venue
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2021, PT I
Keywords
Gazetteer, Keyphrase generation, Encoder-decoder, Copy mechanism, Attention
DocType
Conference
Volume
12712
ISSN
0302-9743
Citations
0
PageRank
0.34
References
0
Authors
4
Name | Order | Citations | PageRank
T. Y. S. S. Santosh | 1 | 1 | 2.05
Debarshi Kumar Sanyal | 2 | 36 | 10.62
Plaban Kumar Bhowmick | 3 | 20 | 8.62
Partha Pratim Das | 4 | 18 | 13.94