Title
A Log Likelihood Predictor for Genomic Classification of Oral Cancer using Principle Component Analysis for Feature Selection
Abstract
DNA microarrays are powerful tools for exploring gene expression and predicting disease state. However, since the number of variables (genes) typically exceeds the number of samples (tissue specimens), many potentially spurious genes may be selected for a predictor function. Principal component analysis (PCA) can greatly reduce the high-dimensional microarray data space while retaining most of the inherent variability. We propose a methodology that uses PCA to identify a predictor vector between two mutually exclusive and collectively exhaustive classes. By projecting the training set onto this vector, a distribution of projections can be computed for each class. A log-likelihood ratio is then calculated for class membership. We used this methodology to classify 48 biopsy specimens as either oral squamous cell carcinoma or normal oral mucosa using oligonucleotide microarrays. The system was trained using a set of half the samples, and correctly predicted the membership of the other half. The three most positively predictive and the three most negatively predictive genes were all keratins that are known markers of squamous cell carcinoma.
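The pipeline described in the abstract (PCA, projection onto a predictor vector, per-class projection distributions, log-likelihood ratio) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the predictor vector is chosen among the principal components or which distribution is fitted to the projections, so the rule used here (the component with the largest class-mean gap relative to its spread) and the Gaussian fit are both assumptions. The data are synthetic, and the function names `fit_predictor` and `log_likelihood_ratio` are hypothetical.

```python
import numpy as np

def fit_predictor(X, y, n_components=5):
    """X: (samples, genes) expression matrix; y: 0/1 class labels."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # PCA via SVD: rows of Vt are the principal axes in gene space.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    comps = Vt[:n_components]
    scores = Xc @ comps.T
    # Assumption: take as predictor vector the component whose class
    # means lie furthest apart relative to the overall spread.
    gap = np.abs(scores[y == 1].mean(axis=0) - scores[y == 0].mean(axis=0))
    spread = scores.std(axis=0) + 1e-12
    v = comps[int(np.argmax(gap / spread))]
    proj = Xc @ v
    # Assumption: model each class's projections as a Gaussian.
    params = {c: (proj[y == c].mean(), proj[y == c].std(ddof=1)) for c in (0, 1)}
    return mean, v, params

def log_gauss(x, mu, sigma):
    return -0.5 * np.log(2 * np.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)

def log_likelihood_ratio(X, model):
    """log p(projection | class 1) - log p(projection | class 0)."""
    mean, v, params = model
    proj = (X - mean) @ v
    return log_gauss(proj, *params[1]) - log_gauss(proj, *params[0])

# Synthetic stand-in for microarray data: 48 samples, 200 "genes",
# with 20 genes shifted upward in class 1 (not the study's data).
rng = np.random.default_rng(0)
y = np.arange(48) % 2
X = rng.normal(size=(48, 200))
X[y == 1, :20] += 4.0

# Train on half of the samples, predict the other half.
train, test = np.arange(24), np.arange(24, 48)
model = fit_predictor(X[train], y[train])
llr = log_likelihood_ratio(X[test], model)
pred = (llr > 0).astype(int)
accuracy = float((pred == y[test]).mean())
print("held-out accuracy:", accuracy)
```

A positive ratio favours class 1 and a negative ratio favours class 0; the magnitude indicates how strongly the projection supports that class under the fitted distributions. The entries of the predictor vector `v` give each gene's weight, which is one way the most positively and most negatively predictive genes could be read off.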
Year: 2004
Venue: MedInfo
Keywords: squamous cell., principal component analysis, oligonucleotide array sequence analysis, carcinoma, principle component analysis, microarray data, log likelihood ratio, feature selection, dna microarray
Field: Data mining, Gene, Feature selection, Microarray analysis techniques, Computational biology, Statistics, Spurious relationship, Medicine, Cancer, DNA microarray, Principal component analysis, Carcinoma
DocType: Conference
Citations: 0
PageRank: 0.34
References: 1
Authors: 9
Name                  Order  Citations  PageRank
Mark E. Whipple       1      0          0.34
D. Gregory Farwell    2      0          0.34
S. Nicholas           3      5          1.16
S. Nicholas           4      5          1.16
Chu Chen              5      0          0.34
Mark Whipple          6      18         6.70
Eduardo Mendez        7      5          0.82
D. Gregory Farwell    8      0          0.68
Chu Chen              9      10         1.90