Title
Gene capture prediction and overlap estimation in EST sequencing from one or multiple libraries
Abstract
BACKGROUND: In expressed sequence tag (EST) sequencing, we are often interested in how many genes can be captured in an EST sample of a targeted size. This information provides insight into sequencing efficiency for experimental design, as well as clues to the diversity of expressed genes in the tissue from which the library was constructed.
RESULTS: We propose a compound Poisson process model that can accurately predict the gene capture in a future EST sample based on an initial EST sample. It also allows estimation of the number of genes expressed in one cDNA library or co-expressed in two cDNA libraries. A simulation study establishes the superior performance of the new prediction method over an existing approach. Our analysis of four Arabidopsis thaliana EST sets suggests that the number of expressed genes present in the corresponding cDNA libraries varies from 9155 (root) to 12005 (silique). An observed fraction of co-expressed genes in two different EST sets as low as 25% can correspond to an actual overlap fraction greater than 65%.
CONCLUSION: The proposed method provides a convenient tool for gene capture prediction and cDNA library property diagnosis in EST sequencing.
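The gap between observed and actual overlap mentioned in the abstract can be illustrated with a small sketch. This is not the paper's compound Poisson estimator. It assumes a plain Poisson sampling model in which a gene with rate r yields at least one EST with probability 1 - exp(-r * t) in a sample of relative size t. The rate levels, the group sizes, the helper names, and the definition of overlap (shared genes divided by genes present in either library) are all hypothetical choices made for illustration.

```python
import math


def capture_prob(rate, t):
    """Probability that a gene with Poisson rate `rate` yields at least one EST
    in a sample of relative size `t`."""
    return 1.0 - math.exp(-rate * t)


def expected_capture(rates, t):
    """Expected number of distinct genes captured in a sample of relative size `t`."""
    return sum(capture_prob(r, t) for r in rates)


def expected_observed_overlap(shared, only_a, only_b, t):
    """Ratio of the expected number of genes seen in both samples to the
    expected number seen in at least one sample (a ratio of expectations)."""
    both = sum(capture_prob(r, t) ** 2 for r in shared)
    either_shared = sum(1.0 - (1.0 - capture_prob(r, t)) ** 2 for r in shared)
    either_unique = (sum(capture_prob(r, t) for r in only_a)
                     + sum(capture_prob(r, t) for r in only_b))
    return both / (either_shared + either_unique)


def make_rates(n):
    """Hypothetical expression rates: cycle through a few made-up levels."""
    levels = [0.1, 0.3, 1.0, 3.0]
    return [levels[i % len(levels)] for i in range(n)]


# Made-up libraries: 700 shared genes and 150 genes unique to each library.
shared = make_rates(700)
only_a = make_rates(150)
only_b = make_rates(150)

true_overlap = len(shared) / (len(shared) + len(only_a) + len(only_b))
observed_overlap = expected_observed_overlap(shared, only_a, only_b, t=1.0)

print(f"true overlap fraction:     {true_overlap:.3f}")
print(f"observed overlap fraction: {observed_overlap:.3f}")
print(f"expected genes captured from library A: "
      f"{expected_capture(shared + only_a, 1.0):.1f} of {len(shared) + len(only_a)}")
```

Under these assumptions the observed overlap falls well below the true fraction, because weakly expressed shared genes are often captured in only one of the two samples. Increasing `t` narrows the gap.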
Year: 2005
DOI: 10.1186/1471-2105-6-300
Venue: BMC Bioinformatics
Keywords: computer simulation, expressed sequence tags, gene library, gene expression profiling, poisson distribution, expressed sequence tag, sequence alignment, compound poisson process, algorithms, cdna library, experimental design
Field: Sequence alignment, Gene, Expressed sequence tag, Biology, Genomic library, Nonparametric maximum likelihood, Bioinformatics, Genetics, DNA microarray, Gene expression profiling, Arabidopsis Proteins
DocType: Journal
Volume: 6
Issue: 1
ISSN: 1471-2105
Citations: 2
PageRank: 0.57
References: 6
Authors: 7
Name                    Order   Citations   PageRank
Ji-ping Z. Wang         1       12          1.59
Bruce G. Lindsay        2       1173638.97
Liying Cui              3       44          4.28
P. Kerr Wall            4       33          4.07
Josh Marion             5       2           0.57
Jiaxuan Zhang           6       2           0.57
Claude W. Depamphilis   7       56          5.55