Title
Almost-constant-time clustering of arbitrary corpus subsets
Abstract
Methods exist for constant-time clustering of corpus subsets selected via Scatter/Gather browsing [3]. In this paper we expand on those techniques, giving an algorithm for almost-constant-time clustering of arbitrary corpus subsets. This algorithm is never slower than clustering the document set from scratch, and for medium-sized and large sets it is significantly faster. This algorithm is useful for clustering arbitrary subsets of large corpora - obtained, for instance, by a boolean search - quickly enough to be useful in an interactive setting.
Year
1997
DOI
10.1145/258525.258535
Venue
SIGIR
Field
Data mining, Correlation clustering, Computer science, Cluster analysis
DocType
Conference
Volume
31
Issue
SI
ISSN
0163-5840
ISBN
0-89791-836-3
Citations
29
PageRank
10.28
References
4
Authors
2
Name | Order | Citations | PageRank
Craig Silverstein | 1 | 29 | 10.28
Jan O. Pedersen | 2 | 6301 | 1177.07