Title
Wcd-New Approach Combining Words, Concepts And Documents Based On Ontology
Abstract
In a traditional Information Retrieval (IR) system, a document is represented by a set of words or terms. If the words or terms are regarded as the components of a vector, the model is called the vector space model (VSM). VSM has been widely used in IR systems in recent decades. As new words appear rapidly in the Internet era, the amount of computation becomes very large and degrades the IR system's performance. This paper puts forward a new approach based on the relations among words, concepts and documents, using the notion of an ontology. The approach has two levels: the Word-Concept (WC) level and the Concept-Document (CD) level. At the WC level, a transition probability matrix is constructed from the word-word pairs that appear in the same paragraph, and the principal eigenvector of the matrix (the eigenvector of its largest eigenvalue) is computed. This eigenvector reflects the importance of each word to the concept. At the CD level, a distance matrix is constructed from the distances between words in the concept, and the average variance of its elements is computed. This value determines the relevance of the document to the concept. To expand the query sentence, the Personal Information Profile (PIP) of the user is defined from the user's query history. The approach is shown to be more effective than the previous approach.
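The WC-level idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact construction, so everything here is an assumption: the function name `word_importance` is hypothetical, paragraphs are split on whitespace, co-occurrence counts are row-normalised into transition probabilities, and a lazy power iteration (a swapped-in, standard technique) approximates the principal left eigenvector. It is not the authors' implementation.

```python
from collections import defaultdict
from itertools import combinations


def word_importance(paragraphs, iterations=200, tol=1e-12):
    """Score words by the principal left eigenvector of a co-occurrence
    transition matrix (illustrative assumption, not the paper's code)."""
    # Count how often each unordered pair of distinct words shares a paragraph.
    counts = defaultdict(lambda: defaultdict(float))
    for paragraph in paragraphs:
        words = sorted(set(paragraph.lower().split()))
        for a, b in combinations(words, 2):
            counts[a][b] += 1.0
            counts[b][a] += 1.0

    vocab = sorted(counts)
    if not vocab:
        return {}

    # Row-normalise the counts so each row is a probability distribution.
    transition = {}
    for w in vocab:
        total = sum(counts[w].values())
        transition[w] = {v: c / total for v, c in counts[w].items()}

    # Lazy power iteration: score <- 0.5 * score + 0.5 * (score * P).
    # The lazy step keeps the same stationary vector but avoids oscillation
    # when the co-occurrence graph happens to be bipartite.
    score = {w: 1.0 / len(vocab) for w in vocab}
    for _ in range(iterations):
        nxt = {w: 0.5 * score[w] for w in vocab}
        for w in vocab:
            for v, p in transition[w].items():
                nxt[v] += 0.5 * score[w] * p
        change = max(abs(nxt[w] - score[w]) for w in vocab)
        score = nxt
        if change < tol:
            break
    return score
```

For example, calling `word_importance(["ontology concept word", "ontology concept", "ontology document"])` ranks `ontology` highest, because it co-occurs with the most words in this toy input. Words that never share a paragraph with another word receive no score in this sketch.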
Year
2012
Venue
COMPUTATIONAL INTELLIGENCE AND INTELLIGENT SYSTEMS
Keywords
Ontology, Word-Concept level, Concept-Document level, Personal Information Profile
Field
Data mining, Ontology, Stochastic matrix, Information retrieval, Matrix (mathematics), Computer science, Paragraph, Distance matrix, Vector space model, Sentence, Eigenvalues and eigenvectors
DocType
Conference
Volume
316
ISSN
1865-0929
Citations
0
PageRank
0.34
References
10
Authors
3
Name            Order  Citations  PageRank
Hao-ming Wang   1      0          0.34
Ye Guo          2      1          1.02
Xibing Shi      3      2          1.19