Title
Using A Random Forest Proximity Measure For Variable Importance Stratification In Genotypic Data
Abstract
In this work we study variable significance in classification using the Random Forest proximity matrix and local importance matrix. We use the proximity matrix to group the samples into a number of clusters and use these clusters to stratify the importance of a variable. We apply this approach to a cardiovascular genotype dataset, classifying samples by coronary heart disease status, and find a number of variations related to cardiovascular disease phenotypes. We also use a set of phenotypes associated with this genotype data to match the obtained clusters with coronary heart disease phenotypes.
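The pipeline described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the abstract does not say which clustering method was used, so average-linkage agglomerative clustering on the distance 1 - proximity is an assumption, as is summarising each stratum by the mean local importance per variable. The function names and the toy `leaves` and `local_importance` values are hypothetical. In practice the leaf assignments would come from a fitted forest (for example, the `apply` method of scikit-learn's `RandomForestClassifier` returns the leaf index of each sample in each tree).

```python
from collections import defaultdict


def proximity_matrix(leaves):
    """leaves[i][t] is the leaf index of sample i in tree t.

    Proximity of two samples = fraction of trees in which they share a leaf.
    """
    n, n_trees = len(leaves), len(leaves[0])
    prox = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            shared = sum(a == b for a, b in zip(leaves[i], leaves[j]))
            prox[i][j] = prox[j][i] = shared / n_trees
    return prox


def cluster_by_proximity(prox, n_clusters):
    """Average-linkage agglomerative clustering on distance 1 - proximity.

    Assumption: the clustering method is chosen for illustration only.
    """
    clusters = [[i] for i in range(len(prox))]
    while len(clusters) > n_clusters:
        best = None  # (distance, index a, index b) of the closest pair
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                dist = sum(1.0 - prox[i][j] for i, j in pairs) / len(pairs)
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = [0] * len(prox)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def stratified_importance(local_importance, labels):
    """Mean local importance of each variable within each cluster.

    local_importance[i][v] is the importance of variable v for sample i.
    """
    by_cluster = defaultdict(list)
    for row, label in zip(local_importance, labels):
        by_cluster[label].append(row)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in by_cluster.items()
    }


# Hypothetical toy data: 4 samples, 3 trees, 2 variables.
leaves = [
    [0, 1, 0],
    [0, 1, 1],
    [2, 3, 2],
    [2, 3, 2],
]
local_importance = [
    [0.9, 0.1],
    [0.7, 0.3],
    [0.2, 0.8],
    [0.0, 1.0],
]

prox = proximity_matrix(leaves)
labels = cluster_by_proximity(prox, n_clusters=2)
strata = stratified_importance(local_importance, labels)
print(labels)
print(strata)
```

In this toy example the first variable dominates in one cluster and the second in the other, which is the kind of cluster-specific importance pattern that a single global importance score would average away.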
Year
2014
Venue
PROCEEDINGS IWBBIO 2014: INTERNATIONAL WORK-CONFERENCE ON BIOINFORMATICS AND BIOMEDICAL ENGINEERING, VOLS 1 AND 2
Keywords
Random Forest, Proximity Measure, Feature Importance, Genetic Data Analysis, Machine Learning
DocType
Conference
Citations
0
PageRank
0.34
References
2
Authors
5
Name | Order | Citations | PageRank
José A. Seoane | 1 | 76 | 9.29
Ian N. M. Day | 2 | 26 | 4.53
Colin Campbell | 3 | 568 | 41.10
Juan P. Casas | 4 | 5 | 1.51
Tom R. Gaunt | 5 | 61 | 10.36