Abstract | ||
---|---|---|
Keyphrase extraction plays an important role in automatic document understanding. To give concise and comprehensive information about the content of a document, the keyphrases extracted from it should meet two requirements. First, the keyphrases should be diverse, so as to avoid carrying duplicated information. Second, the keyphrases should cover the various aspects of the topics in the document, so as to avoid unnecessary information loss. In this paper, we address the issue of automatic keyphrase extraction, with emphasis on the diversity and coverage of keyphrases, which are generally ignored in most conventional keyphrase extraction approaches. Specifically, the issue is formulated as a subset learning problem in the framework of structural learning, and a structural SVM is employed to perform the task. Experiments on a scientific literature dataset show that our approach outperforms several state-of-the-art keyphrase extraction approaches, which verifies the benefits of explicit diversity and coverage enhancement. |
Year | DOI | Venue |
---|---|---|
2012 | 10.1007/978-3-642-29253-8_11 | APWeb |
Keywords | Field | DocType |
state-of-the-art keyphrase extraction approach,structural svm,automatic document understanding,unnecessary information loss,comprehensive information,keyphrase extraction,conventional keyphrase extraction approach,automatic keyphrases extraction,extracting keyphrase,coverage enhancement,high diversity,explicit diversity | Scientific literature,Hill climbing,Data mining,Information loss,Information retrieval,Computer science,Structural learning,Keyword extraction,Support vector machine | Conference |
Citations | PageRank | References |
0 | 0.34 | 21 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Weijian Ni | 1 | 14 | 8.09 |
Tong Liu | 2 | 3 | 3.14 |
Qingtian Zeng | 3 | 242 | 43.67 |