Title
Integrating a Lexical Database and a Training Collection for Text Categorization
Abstract
Automatic text categorization is a complex and useful task for many natural language processing applications. Recent approaches to text categorization focus more on algorithms than on the resources involved in this operation. In contrast to this trend, we present an approach based on the integration of widely available resources, such as lexical databases and training collections, to overcome current limitations of the task. Our approach makes use of WordNet synonymy information to increase evidence for poorly trained categories. When testing a direct categorization, a WordNet-based one, a training algorithm, and our integrated approach, the latter exhibits better performance than any of the others. Incidentally, the performance of the WordNet-based approach is comparable with that of the training approach.
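The idea of adding synonymy evidence to poorly trained categories can be illustrated with a minimal sketch. This is not the authors' implementation: the `SYNONYMS` table is a hypothetical stand-in for WordNet lookups, and the `min_docs` threshold, the `boost` weight, and the cosine scoring are all assumptions chosen only to make the example self-contained.

```python
from collections import Counter
import math

# Hypothetical stand-in for WordNet synonym lookups (not real WordNet data).
SYNONYMS = {
    "earnings": {"profit", "income"},
    "grain": {"cereal", "wheat"},
}


def category_vector(name, training_docs, min_docs=2, boost=1.0):
    """Build term weights for a category from its training documents.

    Assumption: a category with fewer than `min_docs` training documents
    counts as poorly trained and receives extra weight on its own name
    and on the synonyms of that name.
    """
    weights = Counter()
    for doc in training_docs:
        weights.update(doc.lower().split())
    if len(training_docs) < min_docs:
        for term in {name} | SYNONYMS.get(name, set()):
            weights[term] += boost
    return weights


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def categorize(document, categories):
    """Return the name of the category most similar to the document."""
    doc_vec = Counter(document.lower().split())
    scores = {
        name: cosine(doc_vec, category_vector(name, docs))
        for name, docs in categories.items()
    }
    return max(scores, key=scores.get)


categories = {
    "earnings": ["company reports quarterly results"],  # one document only
    "grain": ["wheat harvest rises", "corn and wheat exports fall"],
}
print(categorize("profit and income rise for the firm", categories))
```

In this toy setup the single training document for `earnings` shares no terms with the test document, so the category can only be matched through the synonym terms added to its vector.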
Year: 1997
Venue: Clinical Orthopaedics and Related Research
Keywords: natural language processing
Field: Categorization, Information retrieval, Computer science, Lexical database, Natural language processing, Artificial intelligence, Language identification, WordNet, Text categorization
DocType: Journal
Volume: cmp-lg/970
Citations: 7
PageRank: 0.99
References: 9
Authors
2
Name | Order | Citations | PageRank
José María Gómez Hidalgo | 1 | 225 | 24.70
Manuel De Buenaga Rodríguez | 2 | 67 | 16.59