Title
Intelligent Information Retrieval System Using Automatic Thesaurus Construction
Abstract
This paper presents an intelligent information retrieval (IR) system based on automatic thesaurus construction and applies it to document clustering and classification. These two applications are among the most influential and widely studied in the IR research community. We apply two biologically inspired algorithms, the genetic algorithm (GA) and the neural network (NN), to these two tasks. We propose a fuzzy logic controller GA and an adaptive back-propagation NN, which effectively overcome problems of their archetypes, e.g. slow evolution and a tendency to become trapped in a local optimum. Furthermore, a well-constructed thesaurus has been recognised as a valuable tool for effective clustering and classification. It addresses a weakness of the bag-of-words document representation, which ignores important relationships between words, e.g. synonymy and polysemy. To investigate how our IR system could be used effectively, we conduct experiments on four data sets from the benchmark Reuters-21578 document collection and the 20 Newsgroups corpus. The results show that our IR system improves performance in comparison with k-means, a common GA, and a conventional back-propagation NN.
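The synonymy problem mentioned in the abstract can be illustrated with a minimal sketch. It assumes a tiny hand-written thesaurus (the hypothetical `THESAURUS` dictionary below) in place of the automatically constructed one the paper describes, and it does not attempt to handle polysemy. The helper names `bag_of_words` and `cosine` are illustrative and not taken from the paper.

```python
from collections import Counter
import math

# Hypothetical, hand-written thesaurus mapping terms to a shared concept label.
# The paper constructs its thesaurus automatically; this is only a stand-in.
THESAURUS = {
    "car": "vehicle",
    "automobile": "vehicle",
    "buy": "purchase",
    "purchase": "purchase",
}

def bag_of_words(text, thesaurus=None):
    """Count tokens, optionally replacing each token by its thesaurus concept."""
    tokens = text.lower().split()
    if thesaurus is not None:
        tokens = [thesaurus.get(token, token) for token in tokens]
    return Counter(tokens)

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

doc1 = "buy car"
doc2 = "purchase automobile"

# Plain bag of words: the documents share no surface terms.
plain = cosine(bag_of_words(doc1), bag_of_words(doc2))

# Thesaurus-mapped bag of words: synonyms collapse onto the same concepts.
mapped = cosine(bag_of_words(doc1, THESAURUS), bag_of_words(doc2, THESAURUS))

print(plain, mapped)
```

With the plain representation the two documents look unrelated, while the thesaurus-mapped representation treats them as near-identical, which is the kind of relationship a clustering or classification step could then exploit.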
Year: 2011
DOI: 10.1080/03081079.2010.530026
Venue: INTERNATIONAL JOURNAL OF GENERAL SYSTEMS
Keywords: information retrieval, clustering, classification, thesaurus, neural network, genetic algorithm
DocType: Journal
Volume: 40
Issue: 4
ISSN: 0308-1079
Citations: 0
PageRank: 0.34
References: 36
Authors: 4
Name | Order | Citations | PageRank
Wei Song | 1 | 113 | 15.51
Ju Cheng Yang | 2 | 197 | 17.05
Cheng Hua Li | 3 | 11 | 1.63
Soon Cheol Park | 4 | 197 | 14.78