Title
Discriminative discovery of transcription factor binding sites from location data.
Abstract
Genome-wide location analyses based on chromatin immunoprecipitation (ChIP) data offer new opportunities for in silico analysis of transcriptional regulation. We propose a novel discriminative discovery framework for precisely identifying transcriptional regulatory motifs from both positive and negative samples, that is, the sets of upstream sequences of genes bound and not bound by a transcription factor (TF), as determined from genome-wide location data. In this framework, our goal is to find the discriminative motifs that best explain the location data, in the sense that they precisely discriminate the positive samples from the negative ones. First, to discover an initial set of substrings that discriminate between positive and negative samples, we apply a decision tree learning method that produces a text-classification tree, and we extract several clusters of similar substrings from the internal nodes of the learned tree. Second, we construct an initial profile-HMM from each cluster to represent a putative motif and iteratively refine the profile-HMMs to improve discrimination accuracy. Our genome-wide experimental results on yeast show that the method successfully identifies the consensus sequences of TFs known from the literature and achieves significant performance in discriminating between positive and negative samples for all the TFs, whereas most other motif detection methods perform very poorly on this discrimination problem. Our learned profile-HMMs also improve the prediction of false negatives in the ChIP data.
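The first step of the abstract, choosing substrings that separate bound from unbound upstream sequences, can be illustrated with a minimal sketch. This is not the authors' implementation: it only scores fixed-length substrings by information gain, which is the usual criterion for picking a decision-tree split. The toy sequences, the fixed length `k`, and the function names are all assumptions made for illustration.

```python
from math import log2

# Invented toy data: "positives" stand in for upstream sequences of bound
# genes, "negatives" for unbound ones. They are not from the paper.
POSITIVES = ["ACGTGACC", "TTCGTGAA", "GGCGTGAT"]
NEGATIVES = ["ACCTTAGG", "TTAAGGCC", "GGATCCAT"]


def entropy(pos, neg):
    """Binary entropy in bits of a set with `pos` positive and `neg` negative samples."""
    total = pos + neg
    if pos == 0 or neg == 0:
        return 0.0
    p = pos / total
    return -(p * log2(p) + (1 - p) * log2(1 - p))


def information_gain(substring, positives, negatives):
    """Entropy reduction from splitting samples on whether they contain `substring`."""
    pos_hit = sum(substring in s for s in positives)
    neg_hit = sum(substring in s for s in negatives)
    pos_miss = len(positives) - pos_hit
    neg_miss = len(negatives) - neg_hit
    total = len(positives) + len(negatives)
    before = entropy(len(positives), len(negatives))
    after = ((pos_hit + neg_hit) / total) * entropy(pos_hit, neg_hit) \
        + ((pos_miss + neg_miss) / total) * entropy(pos_miss, neg_miss)
    return before - after


def best_split(positives, negatives, k=4):
    """Return the length-k substring with the highest information gain.

    Candidates are sorted so that ties resolve to the alphabetically first one.
    """
    candidates = sorted({
        s[i:i + k]
        for s in positives + negatives
        for i in range(len(s) - k + 1)
    })
    return max(candidates, key=lambda w: information_gain(w, positives, negatives))


if __name__ == "__main__":
    w = best_split(POSITIVES, NEGATIVES)
    print(w, information_gain(w, POSITIVES, NEGATIVES))
```

In a full text-classification tree the same selection would be repeated recursively on each branch, and the substrings collected at internal nodes would then be clustered to seed the profile-HMMs; neither of those later steps is shown here.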
Year
2005
DOI
10.1109/CSB.2005.30
Venue
CSB
Keywords
false negative prediction, negative sample, genome-wide location data, genome-wide experimental result, chip data, location data, positive sample, discriminative motif, discriminative discovery, transcription factor, decision tree, discriminative substrings, microorganisms, transcription factor binding site, decision tree learning, genetics, hidden markov models, chip, chromatin immunoprecipitation, molecular biophysics, decision trees
Field
Decision tree, Substring, DNA binding site, Computer science, Artificial intelligence, Bioinformatics, Hidden Markov model, Consensus sequence, Discriminative model, Decision tree learning, Machine learning, In silico
DocType
Conference
ISSN
1551-7497
ISBN
0-7695-2344-7
Citations
0
PageRank
0.34
References
2
Authors
2
Name
Order
Citations
PageRank
Yuji Kawada110.73
Yasubumi Sakakibara276962.91