Title
Accurate keyphrase extraction by discriminating overlapping phrases
Abstract
In this paper we define the document phrase maximality index (DPM-index), a new measure for discriminating between overlapping keyphrase candidates in a text document. As an application, we developed a supervised learning system that uses 18 statistical features, among them the DPM-index and five other new features. We experimentally compared our results with those of 21 keyphrase extraction methods on the SemEval-2010 Task 5 corpus of scientific articles. When all the systems extract 10 keyphrases per document, our method improves on the F-score of the best system by 13%. In particular, the DPM-index feature increases the F-score of our keyphrase extraction system by 9%, which makes its contribution comparable to that of the well-known TFIDF measure in such a system.
Year: 2014
DOI: 10.1177/0165551514530210
Venue: Journal of Information Science
Keywords: information extraction, keyphrase extraction, scientific digital libraries, text mining
Field: Text mining, Information retrieval, Computer science, Phrase, Information extraction, Text document
DocType: Journal
Volume: 40
Issue: 4
ISSN: 0165-5515
Citations: 5
PageRank: 0.42
References: 44
Authors: 2
Name, Order, Citations, PageRank
Mounia Haddoud, 1, 18, 1.66
Saïd Abdeddaïm, 2, 5, 0.76