Abstract |
---|
Information has become so accessible that it has given rise to an information overload problem. Yet important decisions still have to be made on the basis of the available information. Because it is impossible to read all the relevant content needed to stay informed, one possible solution is to condense the data and obtain the kernel of a text by summarizing it automatically. We present an approach to analyzing text and retrieving valuable information in the form of a semantic graph based on subject-verb-object triplets extracted from sentences. Once the triplets have been generated, we apply several techniques to obtain the semantic graph of the document: co-reference and anaphora resolution of named entities, and semantic normalization of triplets. Finally, we describe the automatic document summarization process, which starts from the semantic representation of the text. A step-by-step experimental evaluation on several Reuters newswire articles shows that the proposed approach performs comparably to other existing methodologies. To assess the document summaries, we use an automatic summarization evaluation package to rank various summarizers. |
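
The pipeline outlined in the abstract (triplets, then entity merging, then a semantic graph) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the triplets are written by hand rather than extracted from text, the `aliases` dictionary is a hypothetical stand-in for co-reference and anaphora resolution, and ranking triplets by node degree is a simple placeholder, not the summarization method used by the authors.

```python
from collections import defaultdict


def build_semantic_graph(triplets, aliases=None):
    """Merge (subject, verb, object) triplets into a directed, labeled graph.

    `aliases` maps lowercased surface forms to a canonical node name; it is a
    hand-written stand-in for co-reference and anaphora resolution.
    """
    aliases = aliases or {}

    def norm(term):
        key = term.strip().lower()
        return aliases.get(key, key)

    graph = defaultdict(set)
    for subj, verb, obj in triplets:
        graph[norm(subj)].add((verb.strip().lower(), norm(obj)))
    return graph


def node_degrees(graph):
    """Count how many edges touch each node (incoming plus outgoing)."""
    degree = defaultdict(int)
    for subj, edges in graph.items():
        for _verb, obj in edges:
            degree[subj] += 1
            degree[obj] += 1
    return degree


def rank_triplets(graph):
    """Score each triplet by the summed degree of its two nodes (placeholder)."""
    degree = node_degrees(graph)
    scored = [
        (degree[subj] + degree[obj], subj, verb, obj)
        for subj, edges in graph.items()
        for verb, obj in edges
    ]
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scored, key=lambda t: (-t[0], t[1], t[2], t[3]))


# Hand-written example triplets (hypothetical, not taken from the paper).
triplets = [
    ("The company", "acquired", "a startup"),
    ("It", "announced", "the deal"),
    ("The startup", "develops", "software"),
]
aliases = {"it": "the company", "a startup": "the startup"}

graph = build_semantic_graph(triplets, aliases)
ranked = rank_triplets(graph)
for score, subj, verb, obj in ranked:
    print(score, subj, verb, obj)
```

In this toy example, merging "It" into "the company" and "a startup" into "the startup" leaves two source nodes, and the triplet that links the two best-connected entities is ranked first. A summarizer built on such a graph could then select the sentences that contain the top-ranked triplets.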
Year | Venue | Keywords |
---|---|---|
2009 | INFORMATICA-JOURNAL OF COMPUTING AND INFORMATICS | natural language processing, text mining, semantic graph, document summarization |
Field | DocType | Volume |
Multi-document summarization, Automatic summarization, Graph, Information retrieval, Computer science, Automation, Document summarization, Artificial intelligence, Natural language processing, Fortuna | Journal | 33 |
Issue | ISSN | Citations |
3 | 0350-5596 | 17 |
PageRank | References | Authors |
1.21 | 3 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Delia Rusu | 1 | 40 | 5.61 |
Blaž Fortuna | 2 | 232 | 20.61 |
Marko Grobelnik | 3 | 1032 | 126.90 |
Dunja Mladenic | 4 | 1484 | 170.14 |