Exploratory class-imbalanced and non-identical data distribution in automatic keyphrase extraction - Citegraph

Paper Info

Title
Exploratory class-imbalanced and non-identical data distribution in automatic keyphrase extraction

Abstract
While supervised learning algorithms hold much promise for automatic keyphrase extraction, most of them presume that the samples are evenly distributed among different classes as well as drawn from an identical distribution, which, however, may not be the case in the real-world task of extracting keyphrases from documents. In this paper, we propose a novel supervised keyphrase extraction approach which deals with the problems of class-imbalanced and non-identical data distributions in automatic keyphrase extraction. Our approach is by nature a stacking approach where meta-models are trained on balanced partitions of a given training set and then combined through introducing meta-features describing particular keyphrase patterns embedded in each document. Experimental results verify the effectiveness of our approach.
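
The abstract does not say how the balanced partitions are built, so the following is only a minimal sketch of one common scheme, not the authors' procedure. It assumes binary labels (1 = keyphrase, 0 = non-keyphrase) and pairs every minority-class sample with a disjoint chunk of the majority class. The function name `balanced_partitions`, the `seed` parameter, and the toy data are hypothetical.

```python
import random


def balanced_partitions(samples, labels, seed=0):
    """Split a binary-labeled training set into class-balanced partitions.

    Assumption (not from the paper): each partition reuses all minority-class
    samples and adds a disjoint, similarly sized chunk of the majority class.
    """
    pairs = list(zip(samples, labels))
    positives = [p for p in pairs if p[1] == 1]
    negatives = [p for p in pairs if p[1] == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")

    # The smaller class is the minority, the larger one the majority.
    minority, majority = sorted((positives, negatives), key=len)

    # Shuffle so the chunks do not depend on the input order.
    random.Random(seed).shuffle(majority)

    # Number of partitions: how many minority-sized chunks the majority holds.
    n_parts = max(1, len(majority) // len(minority))

    # Deal the majority samples round-robin into n_parts disjoint chunks.
    chunks = [majority[i::n_parts] for i in range(n_parts)]
    return [minority + chunk for chunk in chunks]


# Toy data: 12 candidate phrases, 3 of which are labeled as keyphrases.
candidates = [f"phrase_{i}" for i in range(12)]
labels = [1, 1, 1] + [0] * 9
parts = balanced_partitions(candidates, labels)
for part in parts:
    print(sum(y for _, y in part), "keyphrases of", len(part), "candidates")
```

In a stacking setup like the one the abstract describes, one model would be trained per partition and their outputs combined by a further learner. That combination step, and the document-level meta-features it relies on, are not sketched here because the abstract gives no detail on them.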

Year
2012

DOI
10.1007/978-3-642-31362-2_38

Venue
ISNN (2)

Keywords
exploratory class-imbalanced, keyphrase extraction approach, real-world task, supervised learning algorithm, different class, balanced partition, automatic keyphrase extraction, identical distribution, non-identical data distribution, particular keyphrase pattern

Field
Training set, Pattern recognition, Computer science, Artificial intelligence, Supervised training, Machine learning

DocType
Conference

Citations
0

PageRank
0.34

References
20

Authors
3

Authors (3 rows)

Name	Order	Citations	PageRank
Weijian Ni	1	14	8.09
Tong Liu	2	3	3.14
Qingtian Zeng	3	242	43.67
