Title
Learning Advanced TFBS Models from Chip-Seq Data - diChIPMunk: Effective Construction of Dinucleotide Positional Weight Matrices.
Abstract
Identification and subsequent analysis of DNA sequence motifs recognized by transcription factors is an important component of studying transcriptional regulation in higher eukaryotes. In particular, motif discovery methods are applied to construct models of transcription factor binding sites (TFBSs). These TFBS models are then used to predict putative binding sites in genomic regions of interest. The most popular TFBS model is the positional weight matrix (PWM). A PWM is usually constructed from nucleotide positional frequencies estimated from a gapless multiple local alignment of experimentally identified TFBS sequences. Modern high-throughput experiments, such as ChIP-Seq, provide enough data for careful training of more advanced models with more parameters. Nevertheless, the majority of existing tools for TFBS prediction in ChIP-Seq data still rely on PWMs with independent positions. This is partly explained by the only marginal improvement in specificity and sensitivity of TFBS recognition that advanced models trained on ChIP-Seq data have shown over traditional PWMs. Here we present a novel computational tool, diChIPMunk (http://autosome.ru/dichipmunk/), which constructs dinucleotide PWMs that account for correlations between neighboring nucleotides in the input sequences. diChIPMunk retains the advantages of the published ChIPMunk algorithm, including the use of ChIP-Seq peak shape and overall computational efficiency. Using public ChIP-Seq data for several TFs, we show that carefully trained dinucleotide PWMs perform significantly better than PWMs based on mononucleotide frequencies.
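To make the difference between the two model types concrete, the following is a minimal Python sketch of how a mononucleotide and a dinucleotide weight matrix could be derived from a gapless alignment of binding sites. It is not the diChIPMunk algorithm: it does no motif discovery and ignores ChIP-Seq peak shape. The function names `build_pwm` and `score`, the pseudocount of 1.0, the uniform background, and the toy sites are all assumptions made for illustration.

```python
import math
from itertools import product

ALPHABET = "ACGT"


def build_pwm(sites, k=1, pseudocount=1.0):
    """Build a log-odds matrix over k-mers from a gapless alignment.

    k=1 gives a traditional PWM with independent positions;
    k=2 gives a dinucleotide matrix over overlapping neighbor pairs.
    Assumption: uniform background; real tools usually estimate it from data.
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("sites must be equal-length (gapless alignment)")
    symbols = ["".join(p) for p in product(ALPHABET, repeat=k)]
    background = 1.0 / len(symbols)
    total = len(sites) + pseudocount * len(symbols)
    matrix = []
    for pos in range(length - k + 1):
        # Positional counts of each k-mer, smoothed with a pseudocount.
        counts = {s: pseudocount for s in symbols}
        for site in sites:
            counts[site[pos:pos + k]] += 1
        # Log-odds of the smoothed frequency against the background.
        matrix.append(
            {s: math.log((c / total) / background) for s, c in counts.items()}
        )
    return matrix


def score(matrix, word, k=1):
    """Sum the matrix weights of the (overlapping) k-mers of a candidate word."""
    return sum(col[word[pos:pos + k]] for pos, col in enumerate(matrix))


if __name__ == "__main__":
    # Hypothetical aligned sites, for illustration only.
    sites = ["ACGT", "ACGA", "ACGT", "TCGT"]
    mono = build_pwm(sites, k=1)
    di = build_pwm(sites, k=2)
    print(score(mono, "ACGT", k=1), score(di, "ACGT", k=2))
```

A site of length L yields L columns of 4 weights in the mononucleotide case and L - 1 columns of 16 weights in the dinucleotide case, which illustrates why the dinucleotide model needs more training data.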
Year
2013
Venue
BIOINFORMATICS 2013: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON BIOINFORMATICS MODELS, METHODS AND ALGORITHMS
Keywords
Motif Discovery, Transcription Factor Binding Sites, TFBS Models, Positional Weight Matrices, PWM, ChIP-Seq, Dinucleotide Composition
Field
Matrix (mathematics), Computer science, Chip, Artificial intelligence, Machine learning
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
5
Name | Order | Citations | PageRank
Ivan V. Kulakovskiy | 1 | 59 | 6.38
Victor G. Levitsky | 2 | 57 | 7.52
Dmitry G. Oschepkov | 3 | 0 | 0.34
Ilya E. Vorontsov | 4 | 51 | 4.53
Vsevolod Makeev | 5 | 90 | 9.70