Abstract | ||
---|---|---|
We propose a system that extracts the most relevant figures and tables from a set of topically related source documents and integrates them into the extractive text summary produced from the same set. The proposed method is domain independent. It focuses predominantly on generating a list of candidate units (figures/tables) ranked by their computed relevancy. The relevancy measure is based on local and global scores that include direct and indirect references. To test system performance, we created a test collection of document sets that do not adhere to any specific domain. Evaluation experiments show a statistically significant correlation between the system-generated ranked list and the human evaluators' ranking judgments. Our concluding remarks address the feasibility of using the proposed system to summarize document sets whose salient units include figures/tables. |
Year | DOI | Venue |
---|---|---|
2012 | 10.1007/978-3-642-28601-8_34 | CICLing (2) |
Keywords | Field | DocType |
document set,test collection,system performance,specific domain,multi-document summarization,relevancy measure,computed relevancy,proposed system,relevant candidate unit,relevant figure,multi document summarization | Multi-document summarization,Information retrieval,Ranking,Computer science,Artificial intelligence,Natural language processing,Source document,Salient | Conference |
Citations | PageRank | References |
1 | 0.35 | 12 |
Authors | ||
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ashish Sadh | 1 | 1 | 0.68 |
Amit Sahu | 2 | 1 | 0.35 |
Devesh Srivastava | 3 | 1 | 0.35 |
Ratna Sanyal | 4 | 39 | 5.74 |
Sudip Sanyal | 5 | 102 | 11.22 |