Title
Eigengene-based linear discriminant model for tumor classification using gene expression microarray data.
Abstract
The nearest shrunken centroids classifier has become a popular algorithm for tumor classification using gene expression microarray data. Feature selection is embedded in the method: top-ranking genes are selected based on a univariate distance statistic calculated for each gene individually. These univariate statistics summarize gene expression profiles outside the context of the gene co-regulation network, so redundant information is included in the selection procedure. We propose an Eigengene-based Linear Discriminant Analysis (ELDA) to address gene selection in a multivariate framework. The algorithm uses a modified rotated Spectral Decomposition (SpD) technique to select 'hub' genes that associate with the most important eigenvectors. Using three benchmark cancer microarray datasets, we show that ELDA selects the most characteristic genes, leading to substantially smaller classifiers than their univariate feature selection-based analogues. The resulting de-correlated expression profiles make the gene-wise independence assumption more realistic and applicable for the shrunken centroids classifier and other diagonal linear discriminant-type models. Our algorithm further incorporates a misclassification cost matrix, allowing differential penalization of one type of error over another. In the breast cancer data, we show that false-negative prognosis can be controlled via a cost-adjusted discriminant function. R code for the ELDA algorithm is available from the authors upon request.
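To make the cost-adjusted discriminant idea concrete, the following is a minimal sketch of a diagonal linear discriminant rule combined with a misclassification cost matrix. It is not the authors' ELDA code (which is in R and not reproduced here): it omits centroid shrinkage and the eigengene-based gene selection, and the function names, toy centroids, variances, priors, and cost values are all hypothetical choices made for illustration.

```python
import math

def dlda_posteriors(x, centroids, variances, priors):
    """Posterior class probabilities under a diagonal-covariance Gaussian model."""
    log_scores = []
    for mu, prior in zip(centroids, priors):
        # Gene-wise independence: the distance is a sum of per-gene terms.
        dist = sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mu, variances))
        log_scores.append(math.log(prior) - 0.5 * dist)
    top = max(log_scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in log_scores]
    total = sum(weights)
    return [w / total for w in weights]

def cost_adjusted_class(posteriors, cost):
    """Return the class with the lowest expected cost.

    cost[k][j] is the (assumed) cost of predicting class j when the truth is k.
    """
    n = len(posteriors)
    expected = [sum(posteriors[k] * cost[k][j] for k in range(n)) for j in range(n)]
    return min(range(n), key=lambda j: expected[j])

# Hypothetical toy setup: 2 genes; class 0 = good prognosis, class 1 = poor prognosis.
centroids = [[0.0, 0.0], [2.0, 2.0]]
variances = [1.0, 1.0]   # pooled per-gene variances
priors = [0.5, 0.5]
sample = [0.8, 0.8]

post = dlda_posteriors(sample, centroids, variances, priors)

equal_cost = [[0, 1], [1, 0]]      # both error types penalized equally
fn_heavy_cost = [[0, 1], [5, 0]]   # missing a poor-prognosis case costs 5x more

label_equal = cost_adjusted_class(post, equal_cost)
label_fn_heavy = cost_adjusted_class(post, fn_heavy_cost)
```

With equal costs the sample is assigned to the nearer centroid (class 0); once a false negative is penalized five times as heavily, the same sample is assigned to class 1. This is the kind of trade-off a cost matrix is meant to expose.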
Year
2006
DOI
10.1093/bioinformatics/btl442
Venue
Bioinformatics
Keywords
gene co-regulation network context, top-ranking gene, characteristic gene, popular algorithm, gene expression profile, eigengene-based linear discriminant model, tumor classification, feature selection, gene expression microarray data, gene selection, elda algorithm, selection procedure, gene expression, discriminative model, breast cancer, discriminant function, spectral decomposition, eigenvectors, microarray data
Field
Data mining, Pattern recognition, Feature selection, Multivariate statistics, Computer science, Matrix decomposition, Artificial intelligence, Linear discriminant analysis, Classifier (linguistics), Univariate, Statistical assumption, Discriminant function analysis
DocType
Journal
Volume
22
Issue
21
ISSN
1367-4811
Citations
14
PageRank
1.12
References
6
Authors
4
Name, Order, Citations, PageRank
Ronglai Shen, 1, 126, 6.83
Debashis Ghosh, 2, 496, 49.16
Arul Chinnaiyan, 3, 14, 1.12
Zhaoling Meng, 4, 14, 1.46