Abstract | ||
---|---|---|
Keyphrase extraction from social media is a crucial and challenging task. Previous studies usually focus on extracting keyphrases that summarize a corpus. However, they do not take users’ specific needs into consideration. In this paper, we propose a novel three-stage model to learn a keyphrase set that represents or relates to a particular topic. Firstly, a phrase mining algorithm is applied to segment the documents into human-interpretable phrases. Secondly, we propose a weakly supervised model to extract candidate keyphrases, which uses a few pre-specified seed keyphrases to guide the model. The model consequently makes the extracted keyphrases more specific and more closely related to the seed keyphrases (which reflect the user’s needs). Finally, to further identify implicitly related phrases, the PMI-IR algorithm is employed to obtain synonyms of the extracted candidate keyphrases. We conducted experiments on two publicly available datasets from news and Twitter. The experimental results demonstrate that our approach outperforms the state-of-the-art baselines and has the potential to extract high-quality task-oriented keyphrases. |
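
The third stage relies on pointwise mutual information (PMI) to find phrases related to a candidate keyphrase. The abstract does not say how the authors compute it, so the following is only a minimal sketch under stated assumptions: PMI-IR is usually described as estimating PMI from search-engine hit counts, whereas this example substitutes document co-occurrence counts over a small local corpus. The function names `pmi` and `rank_related` and the toy `docs` data are hypothetical and do not come from the paper.

```python
import math
from typing import List, Set, Tuple

# Toy corpus: each document is the set of phrases it contains.
# Assumption: phrase segmentation (stage one) has already been done.
docs: List[Set[str]] = [
    {"flu shot", "vaccine", "clinic"},
    {"flu shot", "vaccine"},
    {"vaccine", "clinic"},
    {"football", "stadium"},
]


def pmi(phrase_a: str, phrase_b: str, docs: List[Set[str]]) -> float:
    """Document-level PMI: log2( p(a, b) / (p(a) * p(b)) ).

    Document counts stand in for the search-engine hit counts
    that PMI-IR would normally query.
    """
    n = len(docs)
    hits_a = sum(1 for d in docs if phrase_a in d)
    hits_b = sum(1 for d in docs if phrase_b in d)
    hits_ab = sum(1 for d in docs if phrase_a in d and phrase_b in d)
    if hits_a == 0 or hits_b == 0 or hits_ab == 0:
        # No co-occurrence evidence at all.
        return float("-inf")
    return math.log2((hits_ab * n) / (hits_a * hits_b))


def rank_related(
    candidate: str, vocabulary: List[str], docs: List[Set[str]], top_k: int = 3
) -> List[Tuple[str, float]]:
    """Return up to top_k phrases with the highest PMI to the candidate."""
    scored = [(p, pmi(candidate, p, docs)) for p in vocabulary if p != candidate]
    # Drop phrases that never co-occur with the candidate.
    scored = [(p, s) for p, s in scored if s != float("-inf")]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


vocabulary = ["vaccine", "clinic", "football", "stadium"]
related = rank_related("flu shot", vocabulary, docs)
print(related)
```

On this toy corpus, "vaccine" ranks above "clinic" for the candidate "flu shot", and phrases that never co-occur with it are dropped. A real system would need far larger counts and probably some smoothing, since PMI is known to overrate rare phrase pairs.
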
Year | DOI | Venue |
---|---|---|
2018 | https://doi.org/10.1007/s11042-017-5041-y | Multimedia Tools Appl. |
Keywords | Field | DocType |
Keyphrase extraction,Weakly supervised learning,Topic model | Social media,Information retrieval,Computer science,Phrase,Natural language processing,Artificial intelligence,Topic model,Data mining algorithm,Task oriented | Journal |
Volume | Issue | ISSN |
77 | 3 | 1380-7501 |
Citations | PageRank | References |
0 | 0.34 | 21 |
Authors | ||
6 |