Title
Topic-Attentive Encoder-Decoder with Pre-Trained Language Model for Keyphrase Generation
Abstract
The keyphrase annotation task aims to retrieve the most representative phrases that express the essential gist of a document. In practice, some of the phrases that best summarize a document are absent from its original text, which motivates researchers to develop generation methods capable of creating such phrases. Existing generation approaches usually adopt an encoder-decoder framework for sequence generation. However, the widely used recurrent neural networks may fail to capture long-range dependencies among items. In addition, since keyphrases are likely to correlate with topical words, some methods introduce topic models into keyphrase generation, but they hardly leverage the global information of topics. In view of this, we employ the Transformer architecture with the pre-trained BERT model as the encoder-decoder framework for keyphrase generation. Although BERT and the Transformer have proven effective for many text mining tasks, they have not been extensively studied for keyphrase generation. Furthermore, we propose a topic attention mechanism that exploits corpus-level topic information globally for keyphrase generation. Specifically, we propose BertTKG, a keyphrase generation method that uses a contextualized neural topic model to learn corpus-level topic representations, which then enhance the document representations learned by the pre-trained language model for better keyphrase decoding. Extensive experiments conducted on three public datasets demonstrate the superiority of BertTKG.
Year
2021
DOI
10.1109/ICDM51629.2021.00200
Venue
2021 21st IEEE International Conference on Data Mining (ICDM 2021)
Keywords
Keyphrase generation, Pre-trained BERT, Transformer, Topic attention, Neural topic model
DocType
Conference
ISSN
1550-4786
Citations
0
PageRank
0.34
References
0
Authors
5
Name            Order  Citations  PageRank
Cangqi Zhou     1      0          0.34
Jinling Shang   2      0          0.34
Jing Zhang      3      0          0.34
Li Qian-Mu      4      33         14.78
Dianming Hu     5      0          1.35