Abstract | ||
---|---|---|
Clustering is the process of discovering groups within data, based on similarities, with minimal, if any, knowledge of their structure. The self-organizing (or Kohonen) map (SOM) is one of the best-known neural network algorithms. It has been widely studied as a software tool for visualizing high-dimensional data. Important features include information compression while preserving the topological and metric relationships of the primary data items. Although Kohonen maps have been applied to clustering, the researcher usually either sets the number of neurons equal to the expected number of clusters or manually segments a two-dimensional map using some a priori knowledge of the data. This paper proposes techniques for automatically partitioning and labeling SOM networks into clusters of neurons that may be used to represent the data clusters. Mathematical morphology operations, such as the watershed transform, are performed on the U-matrix, which is a neuron-distance image. Direct application of the watershed leads to an oversegmented image. Markers are used to identify significant clusters, and homotopy modification suppresses the others. Markers are found automatically by performing a multi-level scan of connected regions of the U-matrix. Each cluster of neurons is a sub-graph that defines, in the input space, complex and nonparametric geometries that approximately describe the shape of the clusters. The process of map partitioning is extended recursively: each cluster of neurons gives rise to a new map, which is trained with the subset of data that was classified to it. The algorithm dynamically produces a hierarchical tree of maps, which explains the cluster structure at several levels of granularity. The distributed, multiple-prototype cluster representation enables the discovery of clusters even when two or more pattern classes are non-separable. |
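
The pipeline named in the abstract (U-matrix, automatically found markers, marker-driven watershed) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: 4-connectivity on a rectangular grid, a U-matrix defined as the mean distance to grid neighbors, markers taken from the region count that occurs at the most threshold levels, and a simple priority flood from the markers standing in for watershed with homotopy modification. All function names are hypothetical.

```python
import heapq
import math
from collections import Counter

STEPS = ((-1, 0), (1, 0), (0, -1), (0, 1))  # 4-connectivity (assumption)


def neighbors(r, c, rows, cols):
    """Yield the in-bounds grid neighbors of cell (r, c)."""
    for dr, dc in STEPS:
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            yield nr, nc


def u_matrix(weights):
    """Mean Euclidean distance from each neuron's weights to its grid neighbors.

    Assumes a rectangular grid with at least two neurons.
    """
    rows, cols = len(weights), len(weights[0])
    u = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            d = [math.dist(weights[r][c], weights[nr][nc])
                 for nr, nc in neighbors(r, c, rows, cols)]
            u[r][c] = sum(d) / len(d)
    return u


def label_regions(u, threshold):
    """Label connected regions of cells with u <= threshold (0 = unlabeled)."""
    rows, cols = len(u), len(u[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if u[r][c] > threshold or labels[r][c]:
                continue
            count += 1
            labels[r][c] = count
            stack = [(r, c)]
            while stack:
                cr, cc = stack.pop()
                for nr, nc in neighbors(cr, cc, rows, cols):
                    if u[nr][nc] <= threshold and not labels[nr][nc]:
                        labels[nr][nc] = count
                        stack.append((nr, nc))
    return labels, count


def find_markers(u, n_levels=20):
    """Scan thresholds; keep the first labeling with the most frequent region count.

    The selection rule is an assumption, not taken from the paper.
    """
    lo, hi = min(map(min, u)), max(map(max, u))
    scans = [label_regions(u, lo + (hi - lo) * i / n_levels)
             for i in range(n_levels)]
    freq = Counter(count for _, count in scans)
    best = max(freq, key=lambda k: (freq[k], k))  # ties go to more regions
    return next(labels for labels, count in scans if count == best)


def flood_from_markers(u, markers):
    """Grow marker labels outward, visiting cells in order of increasing height."""
    rows, cols = len(u), len(u[0])
    labels = [row[:] for row in markers]
    heap = [(u[r][c], r, c)
            for r in range(rows) for c in range(cols) if labels[r][c]]
    heapq.heapify(heap)
    while heap:
        _, r, c = heapq.heappop(heap)
        for nr, nc in neighbors(r, c, rows, cols):
            if not labels[nr][nc]:
                labels[nr][nc] = labels[r][c]
                heapq.heappush(heap, (u[nr][nc], nr, nc))
    return labels
```

On a toy 4x6 grid whose left three columns all hold one weight vector and whose right three columns hold another, this sketch finds two markers and splits the grid down the middle. The recursive step described in the abstract would then train a new map on the data assigned to each labeled group of neurons.
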
Year | DOI | Venue |
---|---|---|
2001 | 10.1117/12.421088 | DATA MINING AND KNOWLEDGE DISCOVERY: THEORY, TOOLS AND TECHNOLOGY III |
Keywords | Field | DocType |
cluster analysis, data mining, self-organizing maps, watershed transform, knowledge discovery | Data mining, Cluster (physics), Data set, Complete-linkage clustering, Computer science, Mathematical morphology, Self-organizing map, Knowledge extraction, Artificial neural network, Cluster analysis | Conference |
Volume | ISSN | Citations |
4384 | 0277-786X | 15 |
PageRank | References | Authors |
0.85 | 10 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
José Alfredo F. Costa | 1 | 52 | 10.11 |
Márcio L. Netto | 2 | 45 | 5.66 |