Title
Improving the efficiency of multidimensional scaling in the analysis of high-dimensional data using singular value decomposition.
Abstract
Multidimensional scaling (MDS) is a well-known multivariate statistical analysis method used for dimensionality reduction and visualization of similarities and dissimilarities in multidimensional data. The advantage of MDS over singular value decomposition (SVD)-based methods such as principal component analysis is its superior fidelity in representing the distances between instances, especially for high-dimensional geometric objects. Here, we investigate the importance of the choice of initial conditions for MDS, and show that SVD is the best choice to initiate MDS. Furthermore, we demonstrate that using the first principal components of SVD to initiate the MDS algorithm is more efficient than iterating through all the principal components. Contrary to previous suggestions, adding stochasticity to the molecular dynamics simulations typically used for MDS of large datasets likewise does not increase accuracy. Finally, we introduce a k-nearest-neighbor method to analyze the local structure of the geometric objects and use it to control the quality of the dimensionality reduction. We demonstrate what is, to our knowledge, the most efficient and accurate initialization strategy for MDS algorithms, considerably reducing the computational load. SVD-based initialization renders the MDS methodology much more useful in the analysis of high-dimensional data such as functional genomics datasets.
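The workflow outlined in the abstract (SVD-based starting configuration, iterative MDS refinement, nearest-neighbor quality check) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the refinement step below uses the standard SMACOF (Guttman transform) update rather than the molecular dynamics simulation the abstract mentions, and the neighbor-overlap score is only one plausible reading of a "k nearest neighbor" quality measure. All function names (svd_init, smacof_refine, knn_preservation) are hypothetical.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix between the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.sqrt(np.maximum(d2, 0.0))

def svd_init(X, dim=2):
    """Starting configuration: projection onto the first `dim` principal components."""
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :dim] * S[:dim]

def stress(D, Y):
    """Raw stress: sum over pairs i<j of (D_ij - d_ij(Y))^2."""
    return 0.5 * np.sum((D - pairwise_distances(Y)) ** 2)

def smacof_refine(D, Y0, n_iter=100):
    """Refine Y0 with unweighted SMACOF updates (assumption: swapped in for the
    molecular-dynamics minimization described in the abstract)."""
    n = D.shape[0]
    idx = np.arange(n)
    Y = Y0.copy()
    for _ in range(n_iter):
        d = pairwise_distances(Y)
        with np.errstate(divide="ignore", invalid="ignore"):
            ratio = np.where(d > 0, D / d, 0.0)
        B = -ratio
        B[idx, idx] = 0.0
        B[idx, idx] = -B.sum(axis=1)  # make every row sum to zero
        Y = (B @ Y) / n               # Guttman transform
    return Y

def knn_preservation(D_high, D_low, k=5):
    """Mean fraction of each point's k nearest neighbors (in the original space)
    that are also among its k nearest neighbors in the embedding.
    Hypothetical quality score, not necessarily the paper's measure."""
    Dh, Dl = D_high.copy(), D_low.copy()
    np.fill_diagonal(Dh, np.inf)  # exclude each point from its own neighbor list
    np.fill_diagonal(Dl, np.inf)
    nn_high = np.argsort(Dh, axis=1)[:, :k]
    nn_low = np.argsort(Dl, axis=1)[:, :k]
    overlap = [len(set(a) & set(b)) for a, b in zip(nn_high, nn_low)]
    return float(np.mean(overlap)) / k

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 30))       # synthetic stand-in for real data
    D = pairwise_distances(X)
    Y0 = svd_init(X, dim=2)
    Y = smacof_refine(D, Y0)
    print("stress after SVD init:", stress(D, Y0))
    print("stress after refinement:", stress(D, Y))
    print("kNN preservation:", knn_preservation(D, pairwise_distances(Y)))
```

Starting from the SVD projection rather than a random configuration gives the iterative step a layout that already captures the directions of largest variance, which is the intuition behind the efficiency claim in the abstract.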
Year: 2011
DOI: 10.1093/bioinformatics/btr143
Venue: Bioinformatics
Keywords: svd-based initialization,best choice,well-known multivariate statistical analysis,high-dimensional data,dimensionality reduction,mds algorithm,principal component,multidimensional scaling,functional genomics datasets,principal component analysis,accurate initialization strategy,mds methodology,singular value decomposition,high dimensional data
Field: k-nearest neighbors algorithm,Singular value decomposition,Data mining,Clustering high-dimensional data,Dimensionality reduction,Multidimensional scaling,Computer science,Multidimensional analysis,Initialization,Principal component analysis
DocType: Journal
Volume: 27
Issue: 10
ISSN: 1367-4811
Citations: 5
PageRank: 0.48
References: 6
Authors (5)
Name                 Order  Citations  PageRank
Christophe Bécavin   1      9          1.18
Nicolas Tchitchek    2      15         1.22
Colette Mintsa-Eya   3      5          0.48
Annick Lesne         4      41         7.12
Arndt Benecke        5      70         6.28