Title | ||
---|---|---|
Eigengene-based linear discriminant model for tumor classification using gene expression microarray data. |
Abstract | ||
---|---|---|
The nearest shrunken centroids classifier has become a popular algorithm in tumor classification problems using gene expression microarray data. Feature selection is an embedded part of the method: top-ranking genes are selected based on a univariate distance statistic calculated for each gene individually. The univariate statistics summarize gene expression profiles outside of the gene co-regulation network context, leading to redundant information being included in the selection procedure. We propose an Eigengene-based Linear Discriminant Analysis (ELDA) to address gene selection in a multivariate framework. The algorithm uses a modified rotated Spectral Decomposition (SpD) technique to select 'hub' genes that associate with the most important eigenvectors. Using three benchmark cancer microarray datasets, we show that ELDA selects the most characteristic genes, leading to substantially smaller classifiers than the analogues based on univariate feature selection. The resulting de-correlated expression profiles make the gene-wise independence assumption more realistic and applicable for the shrunken centroids classifier and other diagonal linear discriminant-type models. Our algorithm further incorporates a misclassification cost matrix, allowing differential penalization of one type of error over another. In the breast cancer data, we show that false negative prognosis can be controlled via a cost-adjusted discriminant function. R code for the ELDA algorithm is available from the authors upon request. |
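
The cost-adjusted discriminant idea in the abstract can be illustrated with a minimal sketch. This is not the authors' ELDA code (which is in R and is not reproduced here). It assumes a plain diagonal linear discriminant with pooled per-gene variances, leaves out the eigengene-based gene selection and the centroid shrinkage entirely, and runs on made-up toy data. The function names (`fit_dlda`, `posteriors`, `predict`), the class labels, and the cost values are all hypothetical.

```python
import math

def fit_dlda(X, y):
    """Estimate class centroids, pooled per-gene variances, and class priors."""
    classes = sorted(set(y))
    n, p = len(X), len(X[0])
    centroids, priors = {}, {}
    for k in classes:
        rows = [x for x, label in zip(X, y) if label == k]
        centroids[k] = [sum(col) / len(rows) for col in zip(*rows)]
        priors[k] = len(rows) / n
    # Pooled within-class variance, one value per gene (diagonal covariance).
    pooled = [0.0] * p
    for x, label in zip(X, y):
        for j in range(p):
            pooled[j] += (x[j] - centroids[label][j]) ** 2
    pooled = [s / (n - len(classes)) for s in pooled]
    return classes, centroids, pooled, priors

def posteriors(x, model):
    """Class posterior probabilities under the gene-wise independence assumption."""
    classes, centroids, pooled, priors = model
    scores = {}
    for k in classes:
        d2 = sum((xj - mj) ** 2 / vj
                 for xj, mj, vj in zip(x, centroids[k], pooled))
        scores[k] = math.log(priors[k]) - 0.5 * d2
    top = max(scores.values())  # subtract the maximum for numerical stability
    weights = {k: math.exp(s - top) for k, s in scores.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

def predict(x, model, cost=None):
    """Pick the class with the highest posterior, or the lowest expected cost.

    cost[k][g] is the (assumed) penalty for predicting g when the truth is k.
    """
    classes = model[0]
    post = posteriors(x, model)
    if cost is None:
        return max(classes, key=lambda k: post[k])
    risk = {g: sum(cost[k][g] * post[k] for k in classes) for g in classes}
    return min(classes, key=lambda g: risk[g])

# Toy data (hypothetical): two genes, four samples per prognosis group.
X = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0],
     [2.0, 2.0], [3.0, 3.0], [2.0, 3.0], [3.0, 2.0]]
y = ["good"] * 4 + ["poor"] * 4
model = fit_dlda(X, y)

# A false negative (true "poor" called "good") is penalized 20x more heavily.
cost = {"good": {"good": 0.0, "poor": 1.0},
        "poor": {"good": 20.0, "poor": 0.0}}

borderline = [1.3, 1.3]
unadjusted = predict(borderline, model)
adjusted = predict(borderline, model, cost)
```

In this toy setup, raising the cost of a false negative moves the decision boundary toward the good-prognosis centroid, so a borderline sample that the unadjusted rule assigns to the good-prognosis group is instead flagged as poor prognosis.
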
Year | DOI | Venue |
---|---|---|
2006 | 10.1093/bioinformatics/btl442 | Bioinformatics |
Keywords | Field | DocType |
gene co-regulation network context,top-ranking gene,characteristic gene,popular algorithm,gene expression profile,eigengene-based linear discriminant model,tumor classification,feature selection,gene expression microarray data,gene selection,elda algorithm,selection procedure,gene expression,discriminative model,breast cancer,discriminant function,spectral decomposition,eigenvectors,microarray data | Data mining,Pattern recognition,Feature selection,Multivariate statistics,Computer science,Matrix decomposition,Artificial intelligence,Linear discriminant analysis,Classifier (linguistics),Univariate,Statistical assumption,Discriminant function analysis | Journal |
Volume | Issue | ISSN |
22 | 21 | 1367-4811 |
Citations | PageRank | References |
14 | 1.12 | 6 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ronglai Shen | 1 | 126 | 6.83 |
Debashis Ghosh | 2 | 496 | 49.16 |
Arul Chinnaiyan | 3 | 14 | 1.12 |
Zhaoling Meng | 4 | 14 | 1.46 |