Title
Matrix correlations for high-dimensional data: the modified RV-coefficient
Abstract
Motivation: Modern functional genomics generates high-dimensional datasets. It is often convenient to have a single simple number characterizing the relationship between pairs of such high-dimensional datasets in a comprehensive way. Matrix correlations are such numbers and are appealing since they can be interpreted in the same way as Pearson's correlations familiar to biologists. The high-dimensionality of functional genomics data is, however, problematic for existing matrix correlations. The motivation of this article is 2-fold: (i) we introduce the idea of matrix correlations to the bioinformatics community and (ii) we give an improvement of the most promising matrix correlation coefficient (the RV-coefficient) circumventing the problems of high-dimensional data.
Results: The modified RV-coefficient can be used in high-dimensional data analysis studies as an easy measure of common information of two datasets. This is shown by theoretical arguments, simulations and applications to two real-life examples from functional genomics, i.e. a transcriptomics and metabolomics example.
Year
2009
DOI
10.1093/bioinformatics/btn634
Venue
BIOINFORMATICS
Field
Data mining, Correlation coefficient, Clustering high-dimensional data, MATLAB, Computer science, Matrix (mathematics), Functional genomics, Genomics, Correlation, Bioinformatics, RV coefficient
DocType
Journal
Volume
25
Issue
3
ISSN
1367-4803
Citations
5
PageRank
1.26
References
1
Authors
5
Name               Order  Citations  PageRank
Age K Smilde       1      176        16.49
Henk A. L. Kiers   2      169        18.28
S. Bijlsma         3      5          1.26
C. M. Rubingh      4      5          1.26
M. J. Van Erk      5      5          1.26