Paper Info

Title
Ontology-driven Conceptual Document Classification.

Abstract
Document classification based on the lexical-semantic network WordNet is presented. Two types of document classification in Serbian have been experimented with: classification based on chosen concepts from the Serbian WordNet (SWN), and proper names-based classification. Conceptual document classification criteria are constructed from hierarchies rooted in a set of chosen concepts (first case) or from hierarchies rooted in some of the proper names' hypernyms (second case). A classifier of the first type is trained and then tested on an indexed and already classified Ebart corpus of Serbian newspapers (476917 articles). Precision, recall and F-measure show that this type of classification is promising although incomplete, mainly due to the incompleteness of the SWN. In the context of proper names-based classification, a proper names ontology based on the SWN is presented in the paper. A distance-based similarity measure is defined, based on Euclidean and Manhattan distances. Classification of a subset of the Contemporary Serbian Language Corpus is presented.
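The abstract names Euclidean and Manhattan distances as the basis of the similarity measure but does not give its exact form. The following is a minimal sketch only, assuming documents and classes are represented as equal-length numeric vectors (for example, counts of proper-name hypernyms) and assuming a similarity of the form 1 / (1 + distance). The function names, the vector representation, and the similarity formula are illustrative assumptions, not the authors' definitions.

```python
import math

def euclidean(u, v):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def manhattan(u, v):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))

def similarity(u, v, distance=euclidean):
    """Assumed similarity: 1 for identical vectors, approaching 0 as distance grows."""
    return 1.0 / (1.0 + distance(u, v))

def classify(doc_vec, class_profiles, distance=euclidean):
    """Return the name of the class profile most similar to doc_vec.

    class_profiles is a hypothetical dict mapping class name -> profile vector.
    """
    return max(
        class_profiles,
        key=lambda name: similarity(doc_vec, class_profiles[name], distance),
    )

# Hypothetical two-dimensional profiles, purely for illustration.
profiles = {"sport": [5, 0], "politics": [0, 5]}
label = classify([4, 1], profiles, distance=manhattan)
```

Switching the `distance` argument between `euclidean` and `manhattan` changes only how differences between vector components are aggregated; the nearest-profile decision rule stays the same.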

Year	Venue	Keywords
2010	KDIR 2010: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND INFORMATION RETRIEVAL	Document classification, Wordnet, SWN, Ontology, Proper name
Field	DocType	Citations
Document classification, Ontology, Information retrieval, Computer science, Artificial intelligence, Machine learning	Conference	1
PageRank	References	Authors
0.40	0	2

Authors (2 rows)

Name	Order	Citations	PageRank
Gordana Pavlovic-Lazetic	1	35	7.82
Jelena Graovac	2	4	1.80
