Title
Scalable Recursive Top-Down Hierarchical Clustering Approach with Implicit Model Selection for Textual Data Sets
Abstract
Automatic generation of taxonomies is useful in a wide range of applications. In our application scenario, a topical hierarchy should be constructed reasonably fast from a large document collection to aid browsing of the data set. The hierarchy should also be used by the InfoSky projection algorithm to create an information landscape visualization suitable for exploratory navigation of the data. We developed an algorithm that applies a scalable, recursive, top-down clustering approach to generate a dynamic concept hierarchy. The algorithm recursively applies a workflow consisting of preprocessing, clustering, cluster labeling, and projection into 2D space. Besides presenting and discussing the benefits of combining hierarchy browsing with visual exploration, we also investigate the clustering results achieved on a real-world data set.
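The recursive workflow named in the abstract (preprocessing, clustering, cluster labeling) can be illustrated with a minimal sketch. This is not the authors' implementation. The whitespace tokenizer, the deterministic 2-means split on cosine similarity, the top-term labels, and the `min_size` / `max_depth` stopping rule are all assumptions made for illustration. The paper's implicit model selection and the InfoSky projection step are not reproduced here.

```python
from collections import Counter
import math


def vectorize(text):
    """Preprocessing stand-in (assumption): lowercase, split on whitespace, count terms."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(v * b.get(t, 0) for t, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def centroid(vectors):
    """Sum of term counts; cosine ignores scale, so no division is needed."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return total


def two_means(vectors, iterations=10):
    """Deterministic 2-means split (assumption: k is fixed to 2 at every level)."""
    seed_a = vectors[0]
    # Second seed: the vector least similar to the first one.
    seed_b = min(vectors, key=lambda v: cosine(seed_a, v))
    centers = [seed_a, seed_b]
    assign = []
    for _ in range(iterations):
        new_assign = [
            0 if cosine(v, centers[0]) >= cosine(v, centers[1]) else 1
            for v in vectors
        ]
        if new_assign == assign:
            break
        assign = new_assign
        for k in (0, 1):
            members = [v for v, a in zip(vectors, assign) if a == k]
            if members:
                centers[k] = centroid(members)
    return assign


def label(vectors, n_terms=2):
    """Cluster labeling stand-in (assumption): the most frequent terms."""
    return [t for t, _ in centroid(vectors).most_common(n_terms)]


def build_hierarchy(docs, min_size=2, max_depth=3, depth=0):
    """Recursively split docs top-down; stop on small clusters or at max depth."""
    vectors = [vectorize(d) for d in docs]
    node = {"label": label(vectors), "docs": docs, "children": []}
    if len(docs) <= min_size or depth >= max_depth:
        return node
    assign = two_means(vectors)
    parts = [[d for d, a in zip(docs, assign) if a == k] for k in (0, 1)]
    if all(parts):  # recurse only when the split produced two non-empty groups
        node["children"] = [
            build_hierarchy(p, min_size, max_depth, depth + 1) for p in parts
        ]
    return node
```

Calling `build_hierarchy` on a list of document strings returns a nested dictionary with `label`, `docs`, and `children` keys. A practical system would replace the fixed two-way split and the hand-set stopping thresholds with a data-driven choice of the number of clusters per level.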
Year: 2010
DOI: 10.1109/DEXA.2010.25
Venue: DEXA Workshops
Keywords: topical hierarchy, real world data, dynamic concept hierarchy, automatic generation, implicit model selection, textual data sets, application scenario, clustering approach, scalable recursive top-down hierarchical, top-down clustering approach, infosky projection algorithm, algorithm recursively, clustering result, clustering algorithms, k means, encyclopedias, hierarchical clustering, text analysis, labeling, vector space model, top down, model selection, internet
Field: Hierarchical clustering, Canopy clustering algorithm, Fuzzy clustering, Data mining, CURE data clustering algorithm, Data stream clustering, Correlation clustering, Computer science, Artificial intelligence, Constrained clustering, Cluster analysis, Machine learning
DocType: Conference
Citations: 5
PageRank: 0.61
References: 11
Authors: 3
Name | Order | Citations | PageRank
Markus Muhr | 1 | 74 | 5.53
Vedran Sabol | 2 | 304 | 31.98
Michael Granitzer | 3 | 822 | 80.14