Title
Prediction of regulatory networks: genome-wide identification of transcription factor targets from gene expression data.
Abstract
Motivation: Defining regulatory networks, linking transcription factors (TFs) to their targets, is a central problem in post-genomic biology. One might imagine one could readily determine these networks through inspection of gene expression data. However, the relationship between the expression timecourse of a transcription factor and that of its target is not obvious (e.g. it is not a simple correlation over the timecourse), and current analysis methods, such as hierarchical clustering, have not been very successful in deciphering these relationships.
Results: Here we introduce an approach based on support vector machines (SVMs) to predict the targets of a transcription factor by identifying subtle relationships between their expression profiles. In particular, we used SVMs to predict the regulatory targets for 36 transcription factors in the Saccharomyces cerevisiae genome based on microarray expression data from many different physiological conditions. We trained and tested our SVM on a data set constructed to include a significant number of both positive and negative examples, directly addressing data imbalance issues. This was non-trivial given that most of the known experimental information is only for positives. Overall, we found that 63% of our TF-target relationships were confirmed through cross-validation. We further assessed the performance of our regulatory network identifications by comparing them with the results from two recent genome-wide ChIP-chip experiments. Overall, we find that the agreement between our results and these experiments is comparable to the agreement (albeit low) between the two experiments. We find that this network has a delocalized structure with respect to chromosomal positioning, with a given transcription factor having targets spread fairly uniformly across the genome.
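The workflow the abstract describes (a balanced set of positive and negative examples, an SVM, and cross-validation) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the data are synthetic, a single hypothetical TF is used, each example is simply a gene's expression profile, and the classifier is a small linear SVM trained by stochastic subgradient descent on the hinge loss. The function names (make_toy_pairs, train_linear_svm, cross_validate) are hypothetical.

```python
import random


def make_toy_pairs(n_pos=40, n_neg=40, n_conditions=12, seed=0):
    """Build a balanced synthetic data set for one hypothetical TF (assumption)."""
    rng = random.Random(seed)
    tf_profile = [rng.choice([-1.0, 1.0]) for _ in range(n_conditions)]
    data = []
    for _ in range(n_pos):
        # Positive example: the gene tracks the TF profile, plus noise.
        gene = [t + rng.gauss(0, 0.3) for t in tf_profile]
        data.append((gene, 1))
    for _ in range(n_neg):
        # Negative example: the gene's profile is unrelated noise.
        gene = [rng.gauss(0, 0.3) for _ in range(n_conditions)]
        data.append((gene, -1))
    rng.shuffle(data)
    return data


def train_linear_svm(train, epochs=30, lr=0.05, lam=0.001, seed=0):
    """Linear SVM via stochastic subgradient descent on the L2-regularized hinge loss."""
    rng = random.Random(seed)
    w, b = [0.0] * len(train[0][0]), 0.0
    train = list(train)
    for _ in range(epochs):
        rng.shuffle(train)
        for x, y in train:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [wi * (1 - lr * lam) for wi in w]  # L2 shrinkage step
            if margin < 1:  # example violates the margin: hinge-loss update
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(model, x):
    """Return +1 (predicted target) or -1 (predicted non-target)."""
    w, b = model
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


def cross_validate(data, k=5):
    """k-fold cross-validation; returns the accuracy on each held-out fold."""
    folds = [data[i::k] for i in range(k)]
    accuracies = []
    for i in range(k):
        held_out = folds[i]
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        model = train_linear_svm(train)
        correct = sum(predict(model, x) == y for x, y in held_out)
        accuracies.append(correct / len(held_out))
    return accuracies


accuracies = cross_validate(make_toy_pairs())
mean_accuracy = sum(accuracies) / len(accuracies)
print("per-fold accuracy:", accuracies)
print("mean accuracy:", mean_accuracy)
```

The toy positives are deliberately easy to separate, unlike the subtle relationships the abstract refers to, so the accuracy here says nothing about the 63% figure reported for the real data.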
Year
2003
DOI
10.1093/bioinformatics/btg347
Venue
BIOINFORMATICS
Keywords
hierarchical clustering, transcription factor, cross validation, chip, support vector machine
Field
Genome, Hierarchical clustering, Data mining, Computer science, Support vector machine, Gene expression, Regulation of gene expression, Proteome, Bioinformatics, Transcription factor, Gene expression profiling
DocType
Journal
Volume
19
Issue
15
ISSN
1367-4803
Citations
40
PageRank
2.47
References
8
Authors
5
Name                 Order  Citations  PageRank
Jiang Qian           1      40         2.81
Jimmy Lin            2      4800       376.93
Nicholas M Luscombe  3      135        11.26
Haiyuan Yu           4      371        24.42
Mark Gerstein        5      50         3.85