Title
TempoMAGE: a deep learning framework that exploits the causal dependency between time-series data to predict histone marks in open chromatin regions at time-points with missing ChIP-seq datasets
Abstract
Motivation: ChIP-seq profiling of histone tail modifications is commonly used in time-series experiments in development and disease. These assays, however, cover only specific time-points, leaving intermediate or early stages with missing information. Although several machine learning methods have been developed to predict histone marks, none has exploited the dependence that exists in time-series experiments between data generated at specific time-points to extrapolate these findings to time-points where data cannot be generated because material is lacking or scarce (e.g. early developmental stages).
Results: Here, we train a deep learning model, named TempoMAGE, to predict the presence or absence of H3K27ac in open chromatin regions by integrating information from sequence, gene expression, chromatin accessibility and the estimated change in H3K27ac state from a reference time-point. We show that adding reference time-point information systematically improves the model's overall performance. In addition, sequence signatures extracted from our method were exclusive to the training dataset, indicating that our model learned data-specific features. As an application, TempoMAGE was able to predict the activity of enhancers from a pre-validated in vivo dataset, highlighting its potential for functional annotation of putative enhancers.
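The abstract names four inputs (sequence, gene expression, chromatin accessibility, and the estimated change in H3K27ac state from a reference time-point) but does not say how they are encoded. The following is a minimal sketch of one way such inputs could be combined into a single feature vector for one open chromatin region. The function names, the one-hot encoding, the flat concatenation and the example values are all assumptions made for illustration and are not taken from TempoMAGE itself.

```python
# Hypothetical sketch: names, encodings and shapes are illustrative assumptions,
# not the TempoMAGE implementation.
from typing import List

BASES = "ACGT"


def one_hot(seq: str) -> List[List[float]]:
    """One-hot encode a DNA sequence; unknown bases (e.g. N) become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def build_input(seq: str, expression: float, accessibility: float,
                ref_change: float) -> List[float]:
    """Flatten the one-hot sequence and append the three scalar features.

    ref_change stands for the estimated change in H3K27ac state relative to a
    reference time-point (assumed here to be a single number).
    """
    flat = [value for row in one_hot(seq) for value in row]
    return flat + [float(expression), float(accessibility), float(ref_change)]


# Made-up example values for a 5-bp region.
features = build_input("ACGTN", expression=2.5, accessibility=0.8, ref_change=-1)
print(len(features))  # 5 bases * 4 channels + 3 scalar features
```

A real model would more likely feed the sequence through convolutional layers and merge the scalar features later; the flat concatenation here only shows that the reference time-point signal enters the model as an additional input alongside the others.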
Year: 2021
DOI: 10.1093/bioinformatics/btab513
Venue: BIOINFORMATICS
DocType: Conference
Volume: 37
Issue: 23
ISSN: 1367-4803
Citations: 0
PageRank: 0.34
References: 0
Authors: 3
Name              Order  Citations  PageRank
Mohammad Hallal   1      0          0.34
Mariette Awad     2      68         5.71
Pierre Khoueiry   3      0          0.68