Title | ||
---|---|---|
Performance of self-taught documents: exploiting co-relevance structure in a document collection |
Abstract | ||
---|---|---|
In this paper we study the behavior of an information retrieval system in which index terms are assigned at random to both documents and requests. The random indexing is then modified by means of a feedback mechanism derived from a normal probability model and applied to both the request and document representations. Of interest are the convergence properties of the representation vectors. After a few feedback iterations, it is found that well-defined clusters form that accurately represent the co-relevance structure among the documents; in effect, the feedback mechanism has permitted the documents to index themselves. This approach offers an interesting way to extend the dimensionality of the indexing vocabulary. Both this application and a theoretical analysis of the impact of extending the indexing vocabulary are discussed. |
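
The idea in the abstract can be illustrated with a small simulation. The sketch below is not the paper's method: the abstract does not specify the normal-probability feedback rule, so a simple Rocchio-style update (each relevant request/document pair is pulled toward each other) is swapped in as an assumption. The function names, the group structure, and all parameter values (`n_groups`, `rate`, `iterations`, and so on) are hypothetical and chosen only for illustration.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_similarities(docs, doc_group):
    """Mean cosine similarity for same-group and different-group document pairs."""
    same, diff = [], []
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            bucket = same if doc_group[i] == doc_group[j] else diff
            bucket.append(cosine(docs[i], docs[j]))
    return sum(same) / len(same), sum(diff) / len(diff)


def simulate(n_groups=3, docs_per_group=5, reqs_per_group=4,
             n_terms=20, iterations=10, rate=0.3, seed=0):
    """Return ((within, between) before feedback, (within, between) after)."""
    rng = random.Random(seed)
    # Assumed co-relevance structure: a request is relevant to exactly the
    # documents that share its group label.
    doc_group = [g for g in range(n_groups) for _ in range(docs_per_group)]
    req_group = [g for g in range(n_groups) for _ in range(reqs_per_group)]

    # Random indexing: the initial term weights carry no information.
    docs = [[rng.gauss(0, 1) for _ in range(n_terms)] for _ in doc_group]
    reqs = [[rng.gauss(0, 1) for _ in range(n_terms)] for _ in req_group]
    before = mean_similarities(docs, doc_group)

    for _ in range(iterations):
        for r, rg in enumerate(req_group):
            for d, dg in enumerate(doc_group):
                if rg != dg:
                    continue  # only relevant pairs trigger feedback
                old_doc = docs[d]
                # Assumed Rocchio-style rule: move both vectors toward each other.
                docs[d] = [x + rate * (y - x) for x, y in zip(old_doc, reqs[r])]
                reqs[r] = [y + rate * (x - y) for x, y in zip(old_doc, reqs[r])]

    after = mean_similarities(docs, doc_group)
    return before, after
```

Calling `simulate()` returns the mean within-group and between-group cosine similarity of the document vectors before and after feedback. Under these assumptions, co-relevant documents end up with nearly identical vectors even though their initial indexing was random, which is the clustering effect the abstract describes.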
Year | DOI | Venue |
---|---|---|
1986 | 10.1145/253168.253220 | SIGIR |
Keywords | Field | DocType |
convergence property,feedback mechanism,feedback iteration,self-taught document,information retrieval system,clusters form,indexing vocabulary,index term,co-relevance structure,corelevance structure,document representation,document collection,random indexing,indexation,indexing terms | Convergence (routing),Data mining,Random indexing,Information retrieval,Computer science,Document clustering,Search engine indexing,Curse of dimensionality,Vector space model,Index term,Vocabulary | Conference |
ISBN | Citations | PageRank |
0-89791-187-3 | 4 | 8.93 |
References | Authors | |
3 | 1 |
Name | Order | Citations | PageRank |
---|---|---|---|
Abraham Bookstein | 1 | 710 | 480.57 |