Title
CoMeBack: DNA Methylation Array Data Analysis for Co-Methylated Regions.
Abstract
Motivation: High-dimensional DNA methylation (DNAm) array coverage, while sparse in the context of the entire DNA methylome, still constitutes a very large number of CpG probes. The ensuing multiple-test corrections affect the statistical power to detect associations, likely contributing to prevalent limited reproducibility. Array probes measuring proximal CpG sites often have correlated levels of DNAm that may not only be biologically meaningful but also imply statistical dependence and redundancy. New methods that account for such correlations between adjacent probes may enable improved specificity, discovery and interpretation of statistical associations in DNAm array data.

Results: We developed a method named Co-Methylation with genomic CpG Background (CoMeBack) that estimates DNA co-methylation, defined as proximal CpG probes with correlated DNAm across individuals. CoMeBack outputs co-methylated regions (CMRs), spanning sets of array probes constructed based on all genomic CpG sites, including those not measured on the array, and without any phenotypic variable inputs. This approach can reduce the multiple-test correction burden, while enhancing the discovery and specificity of statistical associations. We constructed and validated CMRs in whole blood, using publicly available Illumina Infinium 450 K array data from over 5000 individuals. These CMRs were enriched for enhancer chromatin states, and binding site motifs for several transcription factors involved in blood physiology. We illustrated how CMR-based epigenome-wide association studies can improve discovery and reduce false positives for associations with chronological age.
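The definition of co-methylation given in the abstract (proximal CpG probes with correlated DNAm across individuals) can be illustrated with a small sketch. This is not the CoMeBack algorithm: it ignores the genomic CpG background that CoMeBack uses, and the function name `build_regions`, the parameters `max_gap` and `min_corr`, their default values, and the toy data are all assumptions made for illustration only. It simply chains neighbouring probes that are both close together and positively correlated.

```python
import numpy as np


def build_regions(positions, betas, max_gap=400, min_corr=0.3):
    """Chain adjacent probes into candidate co-methylated regions.

    positions: sorted 1-D sequence of probe coordinates on one chromosome.
    betas: 2-D array, shape (n_individuals, n_probes), of DNAm levels.
    max_gap, min_corr: hypothetical thresholds, not values from the paper.
    Returns a list of regions, each a list of at least two probe indices.
    """
    positions = np.asarray(positions)
    betas = np.asarray(betas, dtype=float)
    regions, current = [], [0]
    for i in range(1, len(positions)):
        close = positions[i] - positions[i - 1] <= max_gap
        # Pearson correlation between neighbouring probes across individuals.
        r = np.corrcoef(betas[:, i - 1], betas[:, i])[0, 1]
        if close and r >= min_corr:
            current.append(i)
        else:
            if len(current) > 1:
                regions.append(current)
            current = [i]
    if len(current) > 1:
        regions.append(current)
    return regions


# Toy data: 20 individuals, 5 probes (made up for this example).
base = np.linspace(0.1, 0.9, 20)
betas = np.column_stack([
    base,               # probe 0
    0.9 * base + 0.05,  # probe 1: close to probe 0 and correlated with it
    base[::-1],         # probe 2: close to probe 1 but anti-correlated
    base,               # probe 3: far from probe 2
    base ** 2,          # probe 4: close to probe 3 and correlated with it
])
positions = [100, 250, 300, 5000, 5100]
regions = build_regions(positions, betas)
print(regions)
```

In this toy case the five probes collapse into two regions plus one unpaired probe. Testing one summary value per region instead of every probe separately would mean fewer tests to correct for, which is the kind of reduction in multiple-test burden the abstract refers to.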
Year: 2020
DOI: 10.1093/bioinformatics/btaa049
Venue: BIOINFORMATICS
DocType: Journal
Volume: 36
Issue: 9
ISSN: 1367-4803
Citations: 0
PageRank: 0.34
References: 0
Authors: 4
Name             Order  Citations  PageRank
Evan Gatev       1      0          0.34
Nicole Gladish   2      0          0.34
Sara Mostafavi   3      0          0.34
Michael S Kobor  4      0          0.34