Abstract | ||
---|---|---|
Latent semantic indexing (LSI) is an information retrieval technique based on the spectral analysis of the term-document matrix, whose empirical success had heretofore been without rigorous prediction and explanation. We prove that, under certain conditions, LSI does succeed in capturing the underlying semantics of the corpus and achieves improved retrieval performance. We also propose the technique of random projection as a way of speeding up LSI. We complement our theorems with... |
Year | DOI | Venue |
---|---|---|
2000 | 10.1006/jcss.2000.1711 | Journal of Computer and System Sciences - Special issue on the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on principles of database systems |
Keywords | Field | DocType |
collaborative filtering, probabilistic analysis, information retrieval, spectral method, latent semantic indexing | Latent semantic indexing, Latent Dirichlet allocation, Information retrieval, Computer science, Probabilistic analysis of algorithms, Document-term matrix, Probabilistic latent semantic analysis | Journal |
Volume | Issue | ISSN |
61 | 2 | 0022-0000 |
Citations | PageRank | References |
234 | 95.83 | 10 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Christos H. Papadimitriou | 1 | 16671 | 3192.54 |
Prabhakar Raghavan | 2 | 13351 | 2776.61 |
Prabhakar Raghavan | 3 | 13351 | 2776.61 |
Santosh Vempala | 4 | 3546 | 523.21 |