Title
Use of a Domain-Specific Ontology to Support Automated Document Categorization at the Concept Level: Method Development and Evaluation
Abstract
Voluminous, conveniently accessible textual documents, created and disseminated by modern information technology, make automated document organization increasingly important for both individuals and organizations. Many existing techniques rely on document content analysis that classifies new, unlabeled documents by measuring the overlap between their important features and the representative features of each document category. However, the performance of feature-based techniques can be significantly hindered by word mismatch and ambiguity problems. As a remedy, this study takes a concept-based approach and proposes a text categorization method that incorporates a domain-specific ontology to support automated document categorization more effectively. The proposed method classifies documents according to their respective ranges of relevant concepts. We empirically evaluate our method against several prevalent benchmarks, including feature-based k-nearest neighbors (kNN) and semantic-based techniques. The results show that the proposed method is more effective than the benchmark techniques; it achieves better performance when using a complete concept hierarchy without considering the hierarchical relationships among concepts. The proposed method illustrates how to incorporate a domain-specific ontology to improve document classification. Our method is computationally efficient because it produces a concept space of relatively few dimensions and does not require semantic space reconstruction as new documents arrive. Moreover, the relationships and patterns that our method generates for classifying documents are explicit and comprehensible.
Year: 2021
DOI: 10.1016/j.eswa.2021.114681
Venue: EXPERT SYSTEMS WITH APPLICATIONS
Keywords: Automated document categorization, Ontology-based text categorization, Knowledge management, Domain-specific ontology, K-nearest neighbors
DocType: Journal
Volume: 174
ISSN: 0957-4174
Citations: 0
PageRank: 0.34
References: 0
Authors: 4
Name | Order | Citations | PageRank
Yen-hsien Lee | 1 | 118 | 16.64
Paul Jen-Hwa Hu | 2 | 0 | 0.34
Wan-Jung Tsao | 3 | 0 | 0.34
Liang Li | 4 | 0 | 0.34