Title
Performance of self-taught documents: exploiting co-relevance structure in a document collection
Abstract
In this paper we study the behavior of an information retrieval system in which index terms are assigned at random to both documents and requests. The random indexing is then modified by means of a feedback mechanism derived from a normal probability model and applied to both the request and document representations. Of interest are the convergence properties of the representation vectors. After a few feedback iterations, it is found that well-defined clusters form that accurately represent the co-relevance structure among the documents; in effect, the feedback mechanism has permitted the documents to index themselves. This approach offers an interesting way to extend the dimensionality of the indexing vocabulary. Both this application and a theoretical analysis of the impact of extending the indexing vocabulary are discussed.
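The feedback idea in the abstract can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions, not the paper's method: the update rule is a plain additive pull between a request and each of its relevant documents, swapped in for the normal probability model, which the abstract does not specify. The names `run_feedback` and `mean_similarity`, the group layout, and all parameter values are hypothetical.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def random_vector(dim, rng):
    # Random indexing: every index-term weight is drawn at random.
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def run_feedback(groups, requests_per_group=3, dim=20, iterations=10,
                 rate=0.3, seed=0):
    """Return document vectors after feedback.

    Assumption: each group is a set of co-relevant documents, and every
    request is relevant to exactly one group.
    """
    rng = random.Random(seed)
    docs = {d: random_vector(dim, rng) for g in groups for d in g}
    requests = [(random_vector(dim, rng), g)
                for g in groups for _ in range(requests_per_group)]
    for _ in range(iterations):
        for query, relevant in requests:
            for d in relevant:
                doc = docs[d]
                for i in range(dim):
                    # Assumed update: request and relevant document each
                    # move a fraction `rate` of the way toward the other.
                    diff = query[i] - doc[i]
                    doc[i] += rate * diff
                    query[i] -= rate * diff
    return docs


def mean_similarity(docs, groups, same_group):
    """Mean cosine similarity over document pairs within or across groups."""
    label = {d: k for k, g in enumerate(groups) for d in g}
    names = sorted(docs)
    sims = [cosine(docs[a], docs[b])
            for i, a in enumerate(names) for b in names[i + 1:]
            if (label[a] == label[b]) == same_group]
    return sum(sims) / len(sims)


groups = [["a1", "a2", "a3"], ["b1", "b2", "b3"], ["c1", "c2", "c3"]]
docs = run_feedback(groups)
print("within-group similarity: ", mean_similarity(docs, groups, True))
print("between-group similarity:", mean_similarity(docs, groups, False))
```

In this sketch, documents that share relevant requests are pulled toward a common vector, so within-group similarity rises well above between-group similarity even though every starting vector was random. That is the sense in which the documents "index themselves".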
Year
1986
DOI
10.1145/253168.253220
Venue
SIGIR
Keywords
convergence property,feedback mechanism,feedback iteration,self-taught document,information retrieval system,clusters form,indexing vocabulary,index term,co-relevance structure,corelevance structure,document representation,document collection,random indexing,indexation,indexing terms
Field
Convergence (routing),Data mining,Random indexing,Information retrieval,Computer science,Document clustering,Search engine indexing,Curse of dimensionality,Vector space model,Index term,Vocabulary
DocType
Conference
ISBN
0-89791-187-3
Citations
4
PageRank
8.93
References
3
Authors
1
Name
Order
Citations
PageRank
Abraham Bookstein
1
710480.57