Title
Efficient interpretable variants of online SOM for large dissimilarity data.
Abstract
Self-organizing maps (SOM) are a useful tool for exploring data. In its original version, the SOM algorithm was designed for numerical vectors. Since then, several extensions have been proposed to handle complex datasets described by (dis)similarities. Most of these extensions represent prototypes by a list of (dis)similarities with the entire dataset and suffer from several drawbacks: their complexity is increased (it becomes quadratic instead of linear), their stability is reduced, and the interpretability of the prototypes is lost. In the present article, we propose and compare two extensions of the stochastic SOM for (dis)similarity data: the first one takes advantage of the online setting in order to maintain a sparse representation of the prototypes at each step of the algorithm, while the second one uses a dimension reduction in a feature space defined by the (dis)similarity. Our contributions to the analysis of (dis)similarity data with topographic maps are thus twofold: first, we present a new version of the SOM algorithm which ensures a sparse representation of the prototypes through online updates. Second, this approach is compared on several benchmarks to a standard dimension reduction technique (K-PCA), which is itself adapted to large datasets with the Nyström approximation. Results demonstrate that both approaches reduce the dimensionality of the prototypes while providing accurate results in a reasonable computational time. The choice between these two strategies depends on the dataset size, the need for easily interpretable results, and the computational resources available. The conclusion provides some recommendations to help the user make this choice.
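The prototype representation criticized in the abstract can be illustrated with a minimal sketch of one stochastic update of a relational SOM, in which each prototype is a convex combination of the observations. This shows the standard relational formulation only, not the sparse or K-PCA variants the article proposes. The function names, the Gaussian neighbourhood, and the parameters `eps` and `radius` are assumptions made for illustration.

```python
import numpy as np

def relational_distances(D, beta):
    """Squared distance from every observation to every prototype.

    D    : (n, n) symmetric matrix of squared dissimilarities
    beta : (U, n) prototype coefficients; each row sums to 1
    Entry (i, u) is (D beta_u)_i - 0.5 * beta_u^T D beta_u.
    """
    Db = D @ beta.T                           # (n, U), costs O(U n^2)
    quad = 0.5 * np.sum(beta * Db.T, axis=1)  # (U,), beta_u^T D beta_u / 2
    return Db - quad

def online_step(D, beta, grid, i, eps=0.1, radius=1.0):
    """One stochastic update for observation i (hypothetical helper).

    grid holds the (U, 2) map coordinates of the units.
    Returns the updated coefficients and the index of the winning unit.
    """
    # Assignment: the unit whose prototype is closest to observation i.
    winner = int(np.argmin(relational_distances(D, beta)[i]))
    # Gaussian neighbourhood on the map (an assumed, common choice).
    grid_d2 = np.sum((grid - grid[winner]) ** 2, axis=1)
    h = np.exp(-grid_d2 / (2.0 * radius ** 2))
    # Move each prototype toward the indicator vector of observation i.
    target = np.zeros(beta.shape[1])
    target[i] = 1.0
    new_beta = beta + eps * h[:, None] * (target - beta)
    return new_beta, winner
```

Each prototype here carries one coefficient per observation, so storage grows with the dataset size and the distance computation is quadratic in it, which is the cost the abstract refers to.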
Year
2017
DOI
10.1016/j.neucom.2016.11.014
Venue
Neurocomputing
Keywords
SOM, Sparse methods, Kernel, Dissimilarity, K-PCA, Nyström
Field
Data mining, Dimensionality reduction, Computer science, Artificial intelligence, Kernel (linear algebra), Interpretability, Feature vector, Pattern recognition, Topographic map, Sparse approximation, Quadratic equation, Curse of dimensionality, Machine learning
DocType
Journal
Volume
225
Issue
C
ISSN
0925-2312
Citations
0
PageRank
0.34
References
34
Authors
3
Name | Order | Citations | PageRank
Jérôme Mariette | 1 | 9 | 2.18
Madalina Olteanu | 2 | 68 | 10.50
Nathalie Villa-Vialaneix | 3 | 72 | 10.94