Title
A Concept-Driven Automatic Ontology Generation Approach for Conceptualization of Document Corpora
Abstract
As the volume of available information grows, many techniques, such as document clustering and information visualization, have been developed to help users make sense of it. However, most of these methods do not directly help users understand the key concepts in a document corpus and the semantic relationships among them, which are critical for capturing the corpus's conceptual structure. We therefore propose a novel approach called 'Clonto' that identifies the key concepts and automatically generates an ontology based on them to conceptualize a document corpus. Clonto applies latent semantic analysis to identify key concepts, allocates documents to these concepts, and uses WordNet to automatically generate a corpus-related ontology. The documents are linked to the ontology through the key concepts. The experimental results show that Clonto identifies key concepts with high precision and that its clustering results outperform those of the STC (Suffix Tree Clustering) algorithm, the Lingo clustering algorithm, the Fuzzy Ants clustering algorithm, and clustering based on TRS (Tolerance Rough Set). Moreover, on the same document corpus, the ontology generated by Clonto exhibits a significantly informative conceptual structure.
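The first two steps named in the abstract (latent semantic analysis to find key concepts, then allocating documents to them) can be illustrated with a minimal sketch. This is not Clonto's actual procedure, which the abstract does not specify in detail: the function names, the toy corpus, and the rule of labeling each latent dimension with its highest-loading term are all assumptions made for illustration. Power iteration with deflation stands in for a library SVD so that the example runs on the standard library alone, and the WordNet-based ontology step is omitted.

```python
from collections import Counter


def term_document_matrix(docs):
    """Return (vocab, A), where A[i][j] counts term i in document j."""
    counts = [Counter(d.split()) for d in docs]
    vocab = sorted({t for c in counts for t in c})
    return vocab, [[c[t] for c in counts] for t in vocab]


def top_left_singular_vectors(A, k, iters=300):
    """Approximate the k leading left singular vectors of A.

    Uses power iteration with deflation on A * A^T, a simple stand-in
    for the truncated SVD that LSA is normally computed with.
    """
    n = len(A)
    M = [[sum(a * b for a, b in zip(A[i], A[j])) for j in range(n)]
         for i in range(n)]
    vectors = []
    for _dim in range(k):
        u = [1.0] * n
        for _step in range(iters):
            w = [sum(M[i][j] * u[j] for j in range(n)) for i in range(n)]
            norm = sum(x * x for x in w) ** 0.5
            u = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue for the converged vector.
        lam = sum(u[i] * sum(M[i][j] * u[j] for j in range(n))
                  for i in range(n))
        # Deflate so the next pass converges to the next latent dimension.
        M = [[M[i][j] - lam * u[i] * u[j] for j in range(n)]
             for i in range(n)]
        vectors.append(u)
    return vectors


def key_concepts_and_allocation(docs, k):
    """Label k latent dimensions and assign each document to one of them."""
    vocab, A = term_document_matrix(docs)
    U = top_left_singular_vectors(A, k)
    # Assumption: a dimension's "key concept" is its highest-loading term.
    concepts = [vocab[max(range(len(vocab)), key=lambda i: abs(u[i]))]
                for u in U]
    allocation = []
    for j in range(len(docs)):
        # Strength of document j along each latent dimension.
        scores = [abs(sum(u[i] * A[i][j] for i in range(len(vocab))))
                  for u in U]
        allocation.append(concepts[scores.index(max(scores))])
    return concepts, allocation


# Toy corpus (hypothetical): two documents per topic, no shared terms.
docs = [
    "cluster document cluster algorithm",
    "cluster algorithm document",
    "ontology concept ontology wordnet ontology",
    "ontology wordnet concept",
]

concepts, allocation = key_concepts_and_allocation(docs, k=2)
print(concepts)
print(allocation)
```

On a real corpus one would typically weight the matrix (for example with TF-IDF), use a library SVD routine, and extract multi-word concepts rather than single terms; the sketch only shows how latent dimensions can yield both concept labels and a document allocation.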
Year
2008
DOI
10.1109/WIIAT.2008.233
Venue
Web Intelligence
Keywords
clustering result,key concept,ontology,informative conceptual structure,allocates document,lingo,wordnet,document corpora,information retrieval,key concept identification,latent semantic analysis,conceptual structure,suffix tree clustering,tolerance rough set,lingo clustering algorithm,concept-driven automatic ontology generation,document clustering,information availability,information visualization,document handling,corpus-related ontology,document corpus,clonto,document corpora conceptualization,visualization,rough set,clustering algorithms,data mining,matrix decomposition,ontologies
Field
Ontology,Data mining,Computer science,Document clustering,Artificial intelligence,Natural language processing,WordNet,Cluster analysis,Ontology (information science),Information retrieval,Suffix tree clustering,Conceptualization,Latent semantic analysis
DocType
Conference
Volume
1
ISBN
978-0-7695-3496-1
Citations
4
PageRank
0.53
References
11
Authors
3
Name, Order, Citations, PageRank
Zheng Hai-Tao, 1, 142, 24.39
Borchert Charles, 2, 32, 2.78
Hong-Gee Kim, 3, 104, 18.80