Title |
---|
Evaluating Text Summarization Systems with a Fair Baseline from Multiple Reference Summaries |
Abstract |
---|
Text summarization is a challenging task. Maintaining linguistic quality and optimizing both compression and retention, all while avoiding redundancy and preserving the substance of a text, is a difficult process. Equally difficult is the task of evaluating such summaries. Interestingly, summaries generated from the same document can differ when written by different humans (or by the same human at different times). Hence, there is no convenient, complete set of rules to test a machine-generated summary. In this paper, we propose a methodology for evaluating extractive summaries. We argue that the overlap between two summaries should be compared against the average intersection size of two randomly generated baselines, and we propose ranking machine-generated summaries based on the concept of closeness with respect to reference summaries. The key idea of our methodology is the use of weighted relatedness towards the reference summaries, normalized by the relatedness of the reference summaries among themselves. Our approach suggests a relative scale and is tolerant towards the length of the summary. |
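The scoring idea in the abstract — comparing summary overlap against the expected intersection of random baselines, then normalizing by how much the reference summaries agree among themselves — can be sketched as follows. This is a minimal illustration, not the paper's exact formula: the function names, the uniform-random (hypergeometric-mean) chance correction, and the simple averaging in place of the paper's weighting are all assumptions.

```python
from itertools import combinations

def overlap(a, b):
    """Intersection size between two extractive summaries (sets of sentence indices)."""
    return len(set(a) & set(b))

def expected_random_overlap(n_doc, len_a, len_b):
    """Expected intersection size of two extracts of the given lengths drawn
    uniformly at random from a document of n_doc sentences (hypergeometric mean).
    This chance-correction choice is an illustrative assumption."""
    return len_a * len_b / n_doc

def normalized_score(system, references, n_doc):
    """Relatedness of a system summary to the references, corrected for chance
    and normalized by the references' average relatedness among themselves."""
    def corrected(a, b):
        return overlap(a, b) - expected_random_overlap(n_doc, len(a), len(b))
    sys_rel = sum(corrected(system, r) for r in references) / len(references)
    pairs = list(combinations(references, 2))
    ref_rel = sum(corrected(a, b) for a, b in pairs) / len(pairs)
    return sys_rel / ref_rel if ref_rel else 0.0
```

Because the score is a ratio against inter-reference agreement, it yields the relative scale the abstract mentions: a value near 1 means the system summary is about as close to the references as they are to each other.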
Year | Venue | Field |
---|---|---|
2016 | ECIR | Data mining, Automatic summarization, Normalization (statistics), Ranking, Information retrieval, Computer science, Closeness, Baseline (configuration management), Redundancy (engineering)
DocType | Citations | PageRank
---|---|---|
Conference | 0 | 0.34
References | Authors |
---|---|
9 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Fahmida Hamid | 1 | 1 | 0.69 |
David Haraburda | 2 | 9 | 1.49 |
Paul Tarau | 3 | 1529 | 113.14 |