Title
Procedure for stability analysis of gene selection from cross-site gene expression data
Abstract
Gene expression studies typically record thousands of expression levels for a small group of patients, so the number of features far exceeds the number of examples. To combat this, researchers may want to combine gene expression data collected at different sites into one data set, narrowing the gap between the number of features (genes) and the number of examples (samples). Even so, the feature count remains large, which makes gene selection a critical component of any process for building models from gene expression data. For instance, when ordering cancer patients by survival time, one might assume that using genes related to cancer development and progression will yield the best model. In this paper, we explore two gene selection techniques and examine how well the selected genes agree between methods. We also check gene set consistency between data sets collected using the same protocols at different research institutions. We show that gene selection can produce very different gene sets given different training data.
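As an illustration of the kind of comparison the abstract describes, the following is a minimal sketch and not the authors' procedure. It assumes a signal-to-noise score of the form |mean_0 - mean_1| / (sd_0 + sd_1) per gene (the record's keywords mention signal-to-noise, but the exact formula is an assumption), and it uses the Jaccard index as a hypothetical measure of agreement between gene sets selected at two sites. The function names and the toy data are invented for the example.

```python
from statistics import mean, stdev


def s2n_scores(expr, labels):
    """Score each gene by an assumed signal-to-noise ratio.

    expr maps gene name -> list of expression values (one per sample);
    labels holds the 0/1 class of each sample, in the same order.
    """
    scores = {}
    for gene, values in expr.items():
        a = [v for v, y in zip(values, labels) if y == 0]
        b = [v for v, y in zip(values, labels) if y == 1]
        denom = stdev(a) + stdev(b)
        scores[gene] = abs(mean(a) - mean(b)) / denom if denom > 0 else 0.0
    return scores


def top_k(scores, k):
    """Return the set of the k highest-scoring genes."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def jaccard(s, t):
    """Overlap of two gene sets: |intersection| / |union|."""
    union = s | t
    return len(s & t) / len(union) if union else 1.0


# Toy data (hypothetical): four genes, six samples per site.
labels = [0, 0, 0, 1, 1, 1]
site_a = {
    "g1": [1.0, 1.2, 0.9, 3.0, 3.1, 2.9],
    "g2": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],
    "g3": [0.5, 0.7, 0.6, 1.5, 1.4, 1.6],
    "g4": [4.0, 3.0, 5.0, 4.5, 3.5, 5.5],
}
site_b = {
    "g1": [1.1, 0.8, 1.0, 2.8, 3.2, 3.0],
    "g2": [1.0, 1.1, 0.9, 2.0, 2.1, 1.9],
    "g3": [0.6, 1.4, 0.5, 0.7, 1.5, 0.6],
    "g4": [4.0, 3.0, 5.0, 4.2, 3.1, 5.2],
}

selected_a = top_k(s2n_scores(site_a, labels), 2)
selected_b = top_k(s2n_scores(site_b, labels), 2)
overlap = jaccard(selected_a, selected_b)
print(sorted(selected_a), sorted(selected_b), overlap)
```

In this toy data the two sites agree on only one of the three genes selected overall, which mirrors the abstract's point that the selected set can change substantially with the training data.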
Year: 2011
DOI: 10.1109/ICSMC.2011.6083767
Venue: Systems, Man, and Cybernetics
Keywords: cancer, data handling, genetics, medical computing, cancer patients, cross site gene expression data, gene selection, stability analysis, survival time, training data, gene expression data, signal-to-noise, stability
Field: Training set, Data mining, Gene selection, Data set, Gene, Computer science, Gene expression, Group method of data handling
DocType: Conference
ISSN: 1062-922X
ISBN: 978-1-4577-0652-3
Citations: 0
PageRank: 0.34
References: 4
Authors: 4
Name                Order  Citations  PageRank
John N. Korecki     1      7          1.13
Lawrence O. Hall    2      5543       335.87
Dmitry B. Goldgof   3      2021       198.90
Steven Eschrich     4      89         10.81