Title
Introducing knowledge into differential expression analysis.
Abstract
Gene expression measurements allow determining sets of up- or down-regulated, or unchanged, genes in a particular experimental condition. Additional biological knowledge can suggest examples of genes from one of these sets. For instance, known target genes of a transcriptional activator are expected, but not certain, to go down after this activator is knocked out. Available differential expression analysis tools do not take such imprecise examples into account. Here we put forward a novel partially supervised mixture modeling methodology for differential expression analysis. Our approach, guided by imprecise examples, clusters expression data into differentially expressed and unchanged genes. The partially supervised methodology is implemented by two methods: a newly introduced belief-based mixture modeling, and soft-label mixture modeling, a method proven efficient in other applications. Using synthetic data, we investigate which input example settings favor each method. In our tests, both belief-based and soft-label methods prove their advantage over semi-supervised mixture modeling in correcting for erroneous examples. We also compare them to alternative differential expression analysis approaches, showing that incorporation of knowledge yields better performance. We present a broad range of knowledge sources and data to which our partially supervised methodology can be applied. First, we determine targets of Ste12 based on yeast knockout data, guided by a Ste12 DNA-binding experiment. Second, we distinguish miR-1 from miR-124 targets in human by clustering expression data from transfection experiments of both microRNAs, using their computationally predicted targets as examples. Finally, we utilize literature knowledge to improve clustering of time-course expression profiles.
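The soft-label idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes a one-dimensional, two-component Gaussian mixture over log expression ratios, and it uses one common soft-label formulation in which each gene's label weights multiply the mixing proportions in the E-step. This is not the authors' implementation; the names `soft_label_em`, `label_weights`, and `init_means` are hypothetical, and the synthetic data are made up for illustration.

```python
import numpy as np


def normal_pdf(x, mean, sd):
    """Density of a normal distribution, evaluated elementwise."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def soft_label_em(x, label_weights, init_means=(0.0, -1.0), n_iter=100):
    """Fit a two-component 1D Gaussian mixture guided by soft labels.

    Component 0 is read as "unchanged", component 1 as "differential".
    label_weights has shape (n_genes, 2); genes without prior knowledge
    carry equal weights, example genes carry weights that favor one
    component. (Assumed formulation, not the paper's exact algorithm.)
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(label_weights, dtype=float)
    means = np.array(init_means, dtype=float)
    sds = np.full(2, x.std())
    props = np.array([0.5, 0.5])

    for _ in range(n_iter):
        # E-step: label weights rescale the usual prior-times-density term.
        dens = np.column_stack([normal_pdf(x, means[k], sds[k]) for k in range(2)])
        weighted = w * props * dens
        resp = weighted / weighted.sum(axis=1, keepdims=True)

        # M-step: standard weighted updates of the mixture parameters.
        nk = resp.sum(axis=0) + 1e-12
        props = nk / nk.sum()
        means = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk
        sds = np.maximum(np.sqrt(var), 1e-6)

    return means, sds, props, resp


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic log ratios: 200 unchanged genes, 50 down-regulated genes.
    x = np.concatenate([rng.normal(0.0, 0.3, 200), rng.normal(-2.0, 0.5, 50)])
    weights = np.full((x.size, 2), 0.5)   # no prior knowledge
    weights[200:210] = [0.1, 0.9]         # correct examples of down-regulation
    weights[:2] = [0.1, 0.9]              # erroneous examples (truly unchanged)
    means, sds, props, resp = soft_label_em(x, weights)
    print("estimated means:", means)
    print("posterior of erroneous examples being differential:", resp[:2, 1])
```

Because the label weights only rescale the posterior rather than fixing it, an example gene whose expression clearly fits the other component can still be reassigned, which is the sense in which imprecise examples are tolerated.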
Year: 2010
DOI: 10.1089/cmb.2010.0034
Venue: JOURNAL OF COMPUTATIONAL BIOLOGY
Keywords: differential expression analysis, partially supervised mixture modeling
Field: Analysis tools, Differential expression, Mixture modeling, Transcriptional Activator, Regulation of gene expression, Synthetic data, Artificial intelligence, Bioinformatics, Cluster analysis, Mathematics, Machine learning, Gene expression profiling
DocType:
Journal:
Volume: 17
Issue: 8
ISSN: 1066-5277
Citations: 2
PageRank: 0.38
References: 17
Authors: 4
Name                Order  Citations  PageRank
Ewa Szczurek        1      49         6.75
Przemysław Biecek   2      18         3.15
Jerzy Tiuryn        3      1210       126.00
Martin Vingron      4      1754       298.16