Title | ||
---|---|---|
Hybrid approach for visualization of documents clusters using GHSOM and Sammon projection |
Abstract | ||
---|---|---|
This paper presents a hybrid approach to visualizing document sets that combines a hierarchical clustering method, based on the Growing Hierarchical Self-Organizing Map (GHSOM) algorithm, with Sammon projection. Algorithms based on self-organizing maps provide a robust clustering method suitable for visualizing larger numbers of documents on grid-based 2D maps. Sammon projection is a nonlinear projection method suited mostly to visualizing smaller sets of objects on (usually 2D) maps. We implemented and tested a combination of these approaches: the starting set of documents is first organized by GHSOM into subsets of similar documents; then, for the clusters at the end of the clustering phase, which contain fewer inputs, Sammon maps are created to also distinguish the documents within these clusters. Clusters are described by characteristic terms extracted using information gain analysis. The hybrid algorithm was implemented using the existing JBOWL library and tested on English-language documents. |
Year | DOI | Venue |
---|---|---|
2013 | 10.1109/SACI.2013.6608994 | Applied Computational Intelligence and Informatics |
Keywords | Field | DocType |
data visualisation,document handling,information analysis,natural language processing,pattern clustering,self-organising feature maps,English language,GHSOM,Sammon projection,clustering phase,documents cluster visualization,grid-based 2D maps,growing hierarchical self-organizing maps algorithm,hierarchical clustering method,hybrid algorithm,information gain analysis,library JBOWL,nonlinear projection method,robust clustering method | Hierarchical clustering,Sammon mapping,Data mining,Canopy clustering algorithm,Fuzzy clustering,CURE data clustering algorithm,Pattern recognition,Correlation clustering,Computer science,Artificial intelligence,Cluster analysis,Brown clustering | Conference |
ISBN | Citations | PageRank |
978-1-4673-6397-6 | 1 | 0.35 |
References | Authors |
---|---|
8 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Peter Butka | 1 | 41 | 8.44 |
Jana Pócsová | 2 | 33 | 6.02 |
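
The Sammon projection step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' JBOWL implementation: the function names (`pairwise_distances`, `sammon_stress`, `stress_gradient`, `sammon_map`) and all parameters are hypothetical, and plain gradient descent with step halving is used here in place of the pseudo-Newton update from Sammon's original formulation. The sketch assumes the input vectors (for example, term vectors of the documents in one final GHSOM cluster) are pairwise distinct, so that no original distance is zero.

```python
import math
import random


def pairwise_distances(points):
    """Euclidean distance matrix of the original (high-dimensional) vectors."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def sammon_stress(d_high, layout):
    """Sammon stress: sum over pairs of (d* - d)^2 / d*, normalised by sum of d*.

    Assumes every original distance d* is strictly positive.
    """
    n = len(layout)
    total = 0.0
    err = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d_star = d_high[i][j]
            d = math.dist(layout[i], layout[j])
            total += d_star
            err += (d_star - d) ** 2 / d_star
    return err / total


def stress_gradient(d_high, layout, eps=1e-12):
    """Gradient of the Sammon stress with respect to each 2D layout point."""
    n = len(layout)
    c = sum(d_high[i][j] for i in range(n) for j in range(i + 1, n))
    grad = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d_star = d_high[i][j]
            # Guard against coincident layout points to avoid division by zero.
            d = max(math.dist(layout[i], layout[j]), eps)
            coef = -2.0 * (d_star - d) / (c * d_star * d)
            for k in range(2):
                grad[i][k] += coef * (layout[i][k] - layout[j][k])
    return grad


def sammon_map(points, iterations=200, step=0.5, seed=0):
    """Project points to 2D by gradient descent on the Sammon stress.

    A step is accepted only if it lowers the stress; otherwise the step
    size is halved, so the returned stress never exceeds the initial one.
    """
    rng = random.Random(seed)
    d_high = pairwise_distances(points)
    layout = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in points]
    stress = sammon_stress(d_high, layout)
    for _ in range(iterations):
        grad = stress_gradient(d_high, layout)
        candidate = [
            [y[k] - step * g[k] for k in range(2)] for y, g in zip(layout, grad)
        ]
        cand_stress = sammon_stress(d_high, candidate)
        if cand_stress < stress:
            layout, stress = candidate, cand_stress
        else:
            step *= 0.5
    return layout, stress
```

Calling `sammon_map(vectors)` on the vectors of a single small cluster returns 2D coordinates together with the final stress. In the hybrid scheme described in the abstract, a projection of this kind would be applied only to the small clusters at the end of the GHSOM clustering phase, since every iteration costs time quadratic in the number of documents.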