Scatter/Gather: a cluster-based approach to browsing large document collections - Citegraph

Paper Info

Title
Scatter/Gather: a cluster-based approach to browsing large document collections

Abstract
Document clustering has not been well received as an information retrieval tool. Objections to its use fall into two main categories: first, that clustering is too slow for large corpora (with running time often quadratic in the number of documents); and second, that clustering does not appreciably improve retrieval. We argue that these problems arise only when clustering is used in an attempt to improve conventional search techniques. However, looking at clustering as an information access tool in its own right obviates these objections, and provides a powerful new access paradigm. We present a document browsing technique that employs document clustering as its primary operation. We also present fast (linear time) clustering algorithms which support this interactive browsing paradigm.

Year	1992
Venue	SIGIR Forum
Keywords	document clustering, large document collection, information retrieval tool, main category, conventional search technique, linear time, information access tool, clustering algorithm, large corpus, interactive browsing paradigm, powerful new access paradigm, cluster-based approach, information retrieval
Field	Data mining, Fuzzy clustering, Canopy clustering algorithm, CURE data clustering algorithm, Data stream clustering, Information retrieval, Document clustering, Computer science, Conceptual clustering, Brown clustering, Cluster analysis
DocType	Conference
Volume	51
Issue	2
Citations	774
PageRank	217.80
References	11
Authors	4

Authors (4 rows)

Name	Order	Citations	PageRank
Douglas R. Cutting	1	1030	423.10
David R. Karger	2	19367	2233.64
Jan O. Pedersen	3	6301	1177.07
J. W. Tukey	4	1250	346.60
