Abstract | ||
---|---|---|
Gene set analysis is widely used to gain insight from gene expression data. Achieving reproducible results is a fundamental part of any expression analysis. In this paper, we propose a systematic approach to study the effect of sample size on the reproducibility of the results of 10 gene set analysis methods. To do so, we quantify the concept of reproducibility and use real expression datasets of different sizes. Our findings suggest that, as a general pattern, the results of gene set analysis are more reproducible as sample size increases. However, the smallest sample size for achieving reproducible results varies across gene set analysis methods. Moreover, for some methods, increasing sample size leads to an increase in the number of false positives. |
Year | DOI | Venue |
---|---|---|
2018 | 10.1109/BIBM.2018.8621462 | 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) |
Keywords | Field | DocType |
enrichment analysis, gene expression, gene set analysis, sample size | Reproducibility, Pattern recognition, Computer science, Expression analysis, Artificial intelligence, Gene set analysis, Sample size determination, Machine learning, False positive paradox | Conference |
ISSN | ISBN | Citations |
2156-1125 | 978-1-5386-5489-7 | 0 |
PageRank | References | Authors |
0.34 | 0 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Farhad Maleki | 1 | 11 | 4.12 |
Katie Ovens | 2 | 0 | 1.69 |
Ian McQuillan | 3 | 97 | 24.72 |
Anthony J. Kusalik | 4 | 113 | 19.69 |