Title
Extracting replicable associations across multiple studies: Empirical Bayes algorithms for controlling the false discovery rate.
Abstract
In almost every field in genomics, large-scale biomedical datasets are used to report associations. Extracting associations that recur across multiple studies while controlling the false discovery rate is a fundamental challenge. Here, we propose a new method to allow joint analysis of multiple studies. Given a set of p-values obtained from each study, the goal is to identify associations that recur in at least k > 1 studies while controlling the false discovery rate. We propose several new algorithms that differ in how the study dependencies are modeled, and compare them with extant methods under various simulated scenarios. The top-performing algorithm, SCREEN (Scalable Cluster-based REplicability ENhancement), works in three stages: (1) clustering an estimated correlation network of the studies, (2) learning replicability (e.g., of genes) within clusters, and (3) merging the results across the clusters. When applied to two real datasets, SCREEN greatly outperformed standard meta-analysis. First, on a collection of 29 case-control gene expression cancer studies, we detected a large set of consistently up-regulated genes related to proliferation and cell cycle regulation. These genes are consistently up-regulated across many cancer studies and are well connected in known gene networks. Second, on a recent pan-cancer study that examined the expression profiles of patients with and without mutations in the HLA complex, we detected a large active module of up-regulated genes that are related to immune responses and are well connected in known gene networks. This module covers three times as many genes as the original study at a similar false discovery rate, demonstrating the high power of SCREEN. An implementation of SCREEN is available in the supplement.
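The problem statement in the abstract (flag associations that are non-null in at least k studies while controlling the false discovery rate) can be illustrated with a minimal sketch. This is not the SCREEN implementation from the supplement: it assumes the studies are independent, which is the simplest possible dependency model, and it assumes that per-study posterior probabilities of being non-null have already been estimated from the p-values (for example by a two-groups model). The function names `prob_at_least_k` and `select_replicable`, the input matrix, and the threshold `q` are all hypothetical.

```python
import numpy as np


def prob_at_least_k(nonnull_prob, k):
    """Per-row probability that at least k studies are non-null.

    nonnull_prob: (n_items, n_studies) array of per-study posterior
    probabilities of being non-null. Studies are ASSUMED independent.
    """
    nonnull_prob = np.asarray(nonnull_prob, dtype=float)
    n_items, n_studies = nonnull_prob.shape
    # dist[:, j] = P(exactly j non-null studies among those processed so far)
    dist = np.zeros((n_items, n_studies + 1))
    dist[:, 0] = 1.0
    for s in range(n_studies):
        p = nonnull_prob[:, [s]]          # column kept 2-D for broadcasting
        shifted = np.zeros_like(dist)
        shifted[:, 1:] = dist[:, :-1]     # one more non-null study
        dist = dist * (1.0 - p) + shifted * p
    return dist[:, k:].sum(axis=1)


def select_replicable(nonnull_prob, k, q=0.1):
    """Indices of items selected with estimated Bayesian FDR <= q."""
    # Local fdr of the composite null "fewer than k studies are non-null".
    local_fdr = 1.0 - prob_at_least_k(nonnull_prob, k)
    order = np.argsort(local_fdr)
    # Mean local fdr of the top-m items estimates the FDR of selecting them.
    running_fdr = np.cumsum(local_fdr[order]) / np.arange(1, len(order) + 1)
    n_selected = int(np.sum(running_fdr <= q))
    return order[:n_selected]


if __name__ == "__main__":
    # Toy input: 3 genes x 3 studies (made-up probabilities).
    toy = [[0.99, 0.98, 0.97],
           [0.90, 0.10, 0.05],
           [0.02, 0.01, 0.03]]
    print(select_replicable(toy, k=2, q=0.1))
```

Because the running mean of sorted local fdr values is nondecreasing, counting the entries at or below `q` gives the largest admissible selection. Modeling dependence between studies, which is where the algorithms in the paper differ, is deliberately left out of this sketch.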
Year
2017
DOI
10.1371/journal.pcbi.1005700
Venue
PLOS COMPUTATIONAL BIOLOGY
Field
Data mining, Biology, Genetic screen, Genomics, Cluster analysis, Bayes' theorem, False discovery rate, Algorithm, Correlation, Bioinformatics, Genetics, Gene regulatory network, Meta-analysis
DocType
Journal
Volume
13
Issue
8
ISSN
1553-7358
Citations
1
PageRank
0.37
References
7
Authors
3
Name | Order | Citations | PageRank
David Amar | 1 | 18 | 2.12
Ron Shamir | 2 | 3678 | 418.00
Daniel Yekutieli | 3 | 170 | 31.38