Title
Scatter/Gather: a cluster-based approach to browsing large document collections
Abstract
Document clustering has not been well received as an information retrieval tool. Objections to its use fall into two main categories: first, that clustering is too slow for large corpora (with running time often quadratic in the number of documents); and second, that clustering does not appreciably improve retrieval. We argue that these problems arise only when clustering is used in an attempt to improve conventional search techniques. However, looking at clustering as an information access tool in its own right obviates these objections, and provides a powerful new access paradigm. We present a document browsing technique that employs document clustering as its primary operation. We also present fast (linear time) clustering algorithms which support this interactive browsing paradigm.
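The abstract does not spell out the algorithms, so the following is only a minimal Python sketch of what a scatter/gather browsing loop might look like, not the authors' method. It assumes bag-of-words cosine similarity, seeds clusters with the first k documents, and refines them with a fixed number of assignment passes, so each pass costs on the order of k times n similarity computations. All function names (`vectorize`, `scatter`, `gather`, `summarize`) are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term weights, L2-normalized (an assumed representation)."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()} if norm else {}


def cosine(a, b):
    """Dot product of two normalized sparse vectors."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def centroid(vectors):
    """Normalized sum of sparse vectors."""
    total = Counter()
    for v in vectors:
        for t, w in v.items():
            total[t] += w
    norm = math.sqrt(sum(w * w for w in total.values()))
    return {t: w / norm for t, w in total.items()} if norm else {}


def scatter(docs, k, rounds=3):
    """Partition docs into at most k clusters.

    Assumption: seeds are simply the first k documents, which keeps the
    sketch deterministic; this is not the seeding used in the paper.
    """
    vectors = [vectorize(d) for d in docs]
    centers = vectors[:k]
    groups = []
    for _ in range(max(1, rounds)):
        groups = [[] for _ in centers]
        for i, v in enumerate(vectors):
            # Assign each document to its most similar center (one pass).
            best = max(range(len(centers)), key=lambda c: cosine(v, centers[c]))
            groups[best].append(i)
        # Recompute centers; an empty group keeps its previous center.
        centers = [
            centroid([vectors[i] for i in g]) if g else centers[j]
            for j, g in enumerate(groups)
        ]
    return [[docs[i] for i in g] for g in groups if g]


def gather(clusters, chosen):
    """Merge the clusters the user selected into one smaller collection."""
    return [d for idx in chosen for d in clusters[idx]]


def summarize(cluster, n_terms=3):
    """Heaviest centroid terms, as a rough label to show the user."""
    top = centroid([vectorize(d) for d in cluster])
    return sorted(top, key=lambda t: (-top[t], t))[:n_terms]
```

In a browsing session, one would call `scatter` on the whole collection, show `summarize` for each cluster, `gather` the clusters the user picks, and then `scatter` the gathered subset again to refine the view.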
Year
1992
Venue
SIGIR Forum
Keywords
document clustering, large document collection, information retrieval tool, main category, conventional search technique, linear time, information access tool, clustering algorithm, large corpus, interactive browsing paradigm, powerful new access paradigm, cluster-based approach, information retrieval
Field
Data mining, Fuzzy clustering, Canopy clustering algorithm, CURE data clustering algorithm, Data stream clustering, Information retrieval, Document clustering, Computer science, Conceptual clustering, Brown clustering, Cluster analysis
DocType
Conference
Volume
51
Issue
2
Citations
774
PageRank
217.80
References
11
Authors
4
Name
Order
Citations
PageRank
Douglas R. Cutting 11030423.10
David R. Karger 2193672233.64
Jan O. Pedersen 363011177.07
J. W. Tukey 41250346.60