Title
A distribution free summarization method for Affymetrix GeneChip arrays.
Abstract
Affymetrix GeneChip arrays require summarization to combine the probe-level intensities into one value representing the expression level of a gene. However, probe intensity measurements are expected to be affected by different levels of non-specific hybridization and cross-hybridization to non-specific transcripts. Here, we present a new summarization technique, the Distribution Free Weighted method (DFW), which uses information about the variability in probe behavior to estimate the extent of non-specific and cross-hybridization for each probe. The contribution of the probe is weighted accordingly during summarization, without making any distributional assumptions for the probe-level data. We compare DFW with several popular summarization methods on spike-in datasets, via both our own calculations and the 'Affycomp II' competition. The results show that DFW outperforms other methods when sensitivity and specificity are considered simultaneously. With the Affycomp spike-in datasets, the area under the receiver operating characteristic curve for DFW is nearly 1.0 (a perfect value), indicating that DFW can identify all differentially expressed genes with a few false positives. The approach used is also computationally faster than most other methods in current use. The R code for DFW is available upon request. Supplementary data are available at Bioinformatics online.
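To illustrate what an area under the receiver operating characteristic curve (AUC) near 1.0 means on a spike-in dataset, the following is a minimal Python sketch, not the authors' R code and not the Affycomp II scoring procedure. The function name roc_auc, the scores, and the spike-in labels are all hypothetical. The sketch computes the AUC as the fraction of (spiked-in, non-spiked) gene pairs in which the spiked-in gene receives the higher differential-expression score, with ties counted as half.

```python
def roc_auc(scores, is_spiked):
    """AUC as the probability that a randomly chosen spiked-in gene
    outscores a randomly chosen non-spiked gene (ties count as half)."""
    pos = [s for s, y in zip(scores, is_spiked) if y]
    neg = [s for s, y in zip(scores, is_spiked) if not y]
    if not pos or not neg:
        raise ValueError("need at least one gene in each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical differential-expression scores for six genes (made-up values).
scores = [5.1, 4.8, 0.3, 4.9, 0.2, 0.1]
# Hypothetical ground truth: only the first two genes were spiked in.
is_spiked = [True, True, False, False, False, False]

auc = roc_auc(scores, is_spiked)
print(auc)  # 0.875: one non-spiked gene outranks one spiked-in gene
```

In this toy example both spiked-in genes rank near the top, but one non-spiked gene scores above one of them, so the AUC falls short of 1.0. An AUC of exactly 1.0 would require every spiked-in gene to outscore every non-spiked gene.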
Year: 2007
DOI: 10.1093/bioinformatics/btl609
Venue: Bioinformatics
Keywords: probe intensity measurement, perfect value, affymetrix genechip, probe-level data, affycomp spike-in datasets, new summarization technique, probe-level intensity, popular summarization method, distribution free summarization method, affycomp ii, probe behavior, supplementary information, false positive, receiver operating characteristic curve, smu
Field: Data mining, Automatic summarization, Receiver operating characteristic, Computer science, Computer program, Gene chip analysis, Bioinformatics, DNA microarray, False positive paradox
DocType: Journal
Volume: 23
Issue: 3
ISSN: 1367-4811
Citations: 14
PageRank: 1.07
References: 6
Authors: 4
Name                      Order   Citations   PageRank
Zhongxue Chen             1       244         15.77
Monnie McGee              2       20          2.61
Qingzhong Liu             3       588         44.77
Richard H. Scheuermann    4       258         23.91