Cited text spans identification with an improved balanced ensemble model. - Citegraph

Paper Info

Title
Cited text spans identification with an improved balanced ensemble model.

Abstract
Scientific summarization aims to provide a condensed summary of the important contributions of scientific papers. This problem has been extensively explored, and recent work has turned to cited text spans as a basis for generating summaries. Cited text spans are the passages in the cited paper that most accurately reflect the citation. They can be viewed as important aspects of the cited paper, as annotated by the academic community. Hence, identifying cited text spans is of vital importance for providing a different kind of scientific summarization. In this paper, we explore three potential improvements to our previous work, a two-layer ensemble model for the cited text spans identification problem. We first view cited text spans identification as an imbalanced classification problem and compare preprocessing methods for handling the imbalanced dataset. Then we propose the RANdom Sampling Aggregating (RANSA) algorithm to train the classifiers in the first ensemble layer. Finally, an improved stacking framework, Hybrid-Stacking, is applied to combine the first-layer models. Our new ensemble model overcomes flaws of the previous work and shows improved performance on cited text spans identification.

Year	DOI	Venue
2019	10.1007/s11192-019-03167-z	Scientometrics
Keywords	Field	DocType
Scientific summarization, Cited text spans, Ensemble, Stacking	Automatic summarization, Data mining, Ensemble forecasting, Information retrieval, Computer science, Citation, Preprocessor, Sampling (statistics), Academic community, Parameter identification problem	Journal
Volume	Issue	ISSN
120	3	0138-9130
Citations	PageRank	References
0	0.34	0
Authors
5

Authors (5 rows)

Name	Order	Citations	PageRank
Pancheng Wang	1	1	1.71
Shasha Li	2	85	20.31
Haifang Zhou	3	35	9.33
Jintao Tang	4	89	14.00
Ting Wang	5	36	9.43