Title
Sample Size and Reproducibility of Gene Set Analysis
Abstract
Gene set analysis is widely used to gain insight from gene expression data. Achieving reproducible results is a fundamental part of any expression analysis. In this paper, we propose a systematic approach to study the effect of sample size on the reproducibility of the results of 10 gene set analysis methods. To do so, we quantify the concept of reproducibility and use real expression datasets of different sizes. Our findings suggest that, as a general pattern, the results of gene set analysis become more reproducible as sample size increases. However, the smallest sample size needed to achieve reproducible results varies across gene set analysis methods. Moreover, for some methods, increasing sample size leads to an increase in the number of false positives.
Year
2018
DOI
10.1109/BIBM.2018.8621462
Venue
2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Keywords
enrichment analysis,gene expression,gene set analysis,sample size
Field
Reproducibility,Pattern recognition,Computer science,Expression analysis,Artificial intelligence,Gene set analysis,Sample size determination,Machine learning,False positive paradox
DocType
Conference
ISSN
2156-1125
ISBN
978-1-5386-5489-7
Citations
0
PageRank
0.34
References
0
Authors
4
Name                 Order  Citations  PageRank
Farhad Maleki        1      11         4.12
Katie Ovens          2      0          1.69
Ian McQuillan        3      97         24.72
Anthony J. Kusalik   4      113        19.69