Title | ||
---|---|---|
Using A Random Forest Proximity Measure For Variable Importance Stratification In Genotypic Data |
Abstract | ||
---|---|---|
In this work we study variable significance in classification using the Random Forest proximity matrix and local importance matrix. We use the proximity matrix to group the samples into a number of clusters and use these clusters to stratify the importance of each variable. We apply this approach to a cardiovascular genotype dataset, classifying samples by coronary heart disease status, and find a number of variations related to cardiovascular disease phenotypes. We also use a set of phenotypes associated with this genotype data to match the obtained clusters to coronary heart disease phenotypes. |
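
The approach outlined in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not name a clustering algorithm, so the threshold-based grouping below is an assumption, as are the function names, the `threshold` parameter, and the toy data. The sketch assumes the per-tree leaf assignments and the local (per-sample) importance matrix have already been obtained from a trained forest, for example via scikit-learn's `RandomForestClassifier.apply` for the leaf indices.

```python
from collections import defaultdict


def proximity_matrix(leaves):
    """Proximity = fraction of trees in which two samples share a leaf.

    leaves[i][t] is the leaf index of sample i in tree t.
    """
    n = len(leaves)
    n_trees = len(leaves[0])
    prox = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            shared = sum(1 for a, b in zip(leaves[i], leaves[j]) if a == b)
            prox[i][j] = prox[j][i] = shared / n_trees
    return prox


def cluster_by_proximity(prox, threshold=0.5):
    """Assumed clustering step: samples linked by proximity >= threshold
    (directly or through a chain of samples) receive the same label."""
    n = len(prox)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and prox[i][j] >= threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def stratified_importance(local_importance, labels):
    """Mean local importance of each variable within each cluster.

    local_importance[i][v] is the importance of variable v for sample i.
    """
    groups = defaultdict(list)
    for row, label in zip(local_importance, labels):
        groups[label].append(row)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }


# Toy data (hypothetical): 4 samples, 3 trees, 2 variables.
leaves = [[1, 1, 2], [1, 1, 3], [4, 5, 6], [4, 5, 6]]
local_importance = [[0.2, 0.0], [0.4, 0.0], [0.0, 0.6], [0.0, 0.8]]

prox = proximity_matrix(leaves)
labels = cluster_by_proximity(prox, threshold=0.5)
by_cluster = stratified_importance(local_importance, labels)
```

In this toy run the first variable is important only in the first cluster and the second only in the other, which is the kind of cluster-specific signal that a single forest-wide importance score would average away.
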
Year | Venue | Keywords |
---|---|---|
2014 | PROCEEDINGS IWBBIO 2014: INTERNATIONAL WORK-CONFERENCE ON BIOINFORMATICS AND BIOMEDICAL ENGINEERING, VOLS 1 AND 2 | Random Forest, Proximity Measure, Feature Importance, Genetic Data Analysis, Machine Learning |
DocType | Citations | PageRank |
---|---|---|
Conference | 0 | 0.34 |
References | Authors | |
---|---|---|
2 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
José A. Seoane | 1 | 76 | 9.29 |
Ian N. M. Day | 2 | 26 | 4.53 |
Colin Campbell | 3 | 568 | 41.10 |
Juan P. Casas | 4 | 5 | 1.51 |
Tom R. Gaunt | 5 | 61 | 10.36 |