Title
Derivation of minimum best sample size from microarray data sets: A Monte Carlo approach
Abstract
NCBI has been accumulating a large repository of microarray data sets, the Gene Expression Omnibus (GEO). GEO is a valuable resource for pursuing a wide range of biological and pathological questions. The question we ask here is: given a set of gene signatures and a classifier, what is the minimum sample size at which a clinical microarray study can effectively distinguish different types of patient responses to a therapeutic drug? The question is difficult to answer because the sample size of most microarray experiments stored in GEO is very limited. This paper presents a Monte Carlo approach to simulating the best minimum microarray sample size from the available data sets. A Support Vector Machine (SVM) is used as the classifier to compute prediction accuracy at different sample sizes. A logistic function is then fitted to the relationship between sample size and accuracy, from which a theoretical minimum sample size can be derived.
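The pipeline outlined in the abstract (Monte Carlo resampling, SVM accuracy per sample size, logistic fit, inversion at a target accuracy) might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the resampling scheme (class-balanced draws without replacement from a pooled data set), the linear kernel, the three-parameter logistic form, and all names (`synthetic_pool`, `mc_accuracy`, `min_sample_size`) are hypothetical, and the Gaussian data merely stands in for real GEO expression profiles.

```python
import numpy as np
from scipy.optimize import curve_fit
from sklearn.svm import SVC


def synthetic_pool(rng, per_class=100, genes=20, shift=1.5):
    """Hypothetical stand-in for a GEO data set: two Gaussian classes
    that differ in the mean of the first five 'signature genes'."""
    X = rng.normal(size=(2 * per_class, genes))
    y = np.repeat([0, 1], per_class)
    X[y == 1, :5] += shift
    return X, y


def mc_accuracy(X, y, n, trials, rng):
    """Mean held-out SVM accuracy over `trials` random draws of n
    training samples (n even, n/2 per class); the rest is the test set."""
    accs = []
    for _ in range(trials):
        train = np.concatenate([
            rng.choice(np.flatnonzero(y == c), n // 2, replace=False)
            for c in (0, 1)
        ])
        test = np.setdiff1d(np.arange(len(y)), train)
        clf = SVC(kernel="linear").fit(X[train], y[train])
        accs.append(clf.score(X[test], y[test]))
    return float(np.mean(accs))


def logistic(n, a_max, k, n0):
    """Assumed curve: accuracy rises with sample size n toward a_max."""
    return a_max / (1.0 + np.exp(-k * (n - n0)))


def min_sample_size(sizes, accs, target):
    """Fit the logistic curve, then solve logistic(n) = target for n.
    Returns None when the target is at or above the fitted plateau."""
    sizes = np.asarray(sizes, dtype=float)
    accs = np.asarray(accs, dtype=float)
    p0 = [accs.max(), 0.1, float(np.median(sizes))]
    (a_max, k, n0), _ = curve_fit(logistic, sizes, accs, p0=p0, maxfev=10000)
    if target >= a_max:
        return None
    return float(n0 - np.log(a_max / target - 1.0) / k)
```

In use, one would compute `mc_accuracy` for a grid of even sample sizes and pass the resulting sizes and accuracies to `min_sample_size` together with a target accuracy. The returned value is only as reliable as the logistic fit, so it is best read as an estimate rather than a guarantee.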
Year
2011
DOI
10.1109/CIBCB.2011.5948461
Venue
CIBCB
Keywords
monte carlo approach, therapeutic drug, genetics, logistic function, pattern classification, gene expression omnibus, ncbi, geo, biology computing, microarray data sets, support vector machine, monte carlo methods, minimum best sample size, support vector machines, accuracy, monte carlo, mathematical model, logistics, testing, microarray data, sample size
Field
Data mining, Data set, Computer science, Microarray analysis techniques, Artificial intelligence, Classifier (linguistics), Monte Carlo method, Ask price, Support vector machine, Bioinformatics, Logistic function, Machine learning, Sample size determination
DocType
Conference
ISBN
978-1-4244-9896-3
Citations
1
PageRank
0.36
References
5
Authors
3
Name | Order | Citations | PageRank
Chengpeng Bi | 1 | 131 | 11.29
Mara Becker | 2 | 9 | 1.23
J Steven Leeder | 3 | 38 | 3.24