Title
An alignment-free model for comparison of regulatory sequences.
Abstract
Some recent comparative studies have revealed that regulatory regions can retain function over large evolutionary distances, even though the DNA sequences are divergent and difficult to align. It is also known that such enhancers can drive very similar expression patterns. This poses a challenge for the in silico detection of biologically related sequences, as they can only be discovered using alignment-free methods. Here, we present a new computational framework called the Regulatory Region Scoring (RRS) model for the detection of functional conservation of regulatory sequences using predicted occupancy levels of transcription factors of interest. We demonstrate that our model can detect the functional and/or evolutionary links between some non-alignable enhancers with a strong statistical significance. We also identify groups of enhancers that are likely to be similarly regulated. Our model is motivated by previous work on prediction of expression patterns and it can capture similarity by strong binding sites, weak binding sites and even the statistically significant absence of sites. Our results support the hypothesis that weak binding sites contribute to the functional similarity of sequences. Our model fills a gap between two families of models: detailed, data-intensive models for the prediction of precise spatio-temporal expression patterns on the one side, and crude, generally applicable models on the other side. Our model borrows some of the strengths of each group and addresses their drawbacks. The RRS source code is freely available upon publication of this manuscript: http://www2.warwick.ac.uk/fac/sci/systemsbiology/staff/ott/tools_and_software/rrs.
Year
2010
DOI
10.1093/bioinformatics/btq453
Venue
Bioinformatics
Keywords
functional similarity, data-intensive model, strong binding site, expression pattern, non-alignable enhancers, functional conservation, regulatory sequence, similar expression pattern, applicable model, weak binding site, precise spatio-temporal expression pattern, alignment-free model
Field
Sequence alignment, Data mining, Source code, Computer science, DNA sequencing, Bioinformatics, Enhancer, In silico, Regulatory sequence
DocType
Journal
Volume
26
Issue
19
ISSN
1367-4811
Citations
0
PageRank
0.34
References
8
Authors
5
Name              Order   Citations   PageRank
Hashem Koohy      1       0           0.68
Nigel P Dyer      2       0           0.34
John E Reid       3       6           2.68
Georgy Koentges   4       15          1.12
Sascha Ott        5       43          5.89