Title
Joint learning of gene functions--a Bayesian network model approach.
Abstract
In this paper, we develop a machine learning system that determines gene functions from heterogeneous data sources using a Weighted Naive Bayesian network (WNB). Knowledge of gene functions is crucial for understanding many fundamental biological mechanisms, such as regulatory pathways, cell cycles and diseases. Our major goal is to accurately infer the functions of putative genes or Open Reading Frames (ORFs) from existing databases using computational methods. This task is intrinsically difficult because the underlying biological processes involve complex interactions among multiple entities, so many functional links are missed when only one or two sources of data are used for prediction. Our hypothesis is that integrating evidence from multiple complementary sources can significantly improve prediction accuracy. Our experimental results not only support this hypothesis but also provide guidelines for using the WNB system for data collection, training and prediction. The combined training data sets contain gene annotations, gene expression data, clustering outputs, keyword annotations, and sequence homology from public databases. The current system is trained and tested on the genes of the budding yeast Saccharomyces cerevisiae. Through the weight training process, the WNB model can also be used to analyze how much each source of information contributes to prediction performance. This contribution analysis could potentially lead to significant scientific discovery by facilitating the interpretation and understanding of the complex relationships between biological entities.
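The abstract does not give the WNB formulation, so the following is only a minimal sketch of one common form of weighted naive Bayes, in which each evidence source's log-likelihood is scaled by a per-source weight before the sources are combined. The class name, the source names ("cluster", "homolog"), the function labels, the toy data, and the hand-set weights are all hypothetical; the paper learns its weights through training, which is not reproduced here.

```python
import math
from collections import Counter, defaultdict


class WeightedNaiveBayes:
    """Generic weighted naive Bayes over categorical evidence sources.

    Illustrative only: this is not the authors' implementation.
    """

    def __init__(self, weights, alpha=1.0):
        self.weights = weights  # assumed per-source weights, fixed by hand
        self.alpha = alpha      # Laplace smoothing constant

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.n = len(labels)
        # counts[source][label][value] = number of training rows seen
        self.counts = defaultdict(lambda: defaultdict(Counter))
        self.values = defaultdict(set)
        for row, label in zip(rows, labels):
            for source, value in row.items():
                self.counts[source][label][value] += 1
                self.values[source].add(value)
        return self

    def log_scores(self, row):
        scores = {}
        for c in self.classes:
            # Start from the log prior of the class.
            score = math.log(self.class_counts[c] / self.n)
            for source, value in row.items():
                num = self.counts[source][c][value] + self.alpha
                den = self.class_counts[c] + self.alpha * len(self.values[source])
                # A weight of 0 ignores the source; 1 gives plain naive Bayes.
                score += self.weights.get(source, 1.0) * math.log(num / den)
            scores[c] = score
        return scores

    def predict(self, row):
        scores = self.log_scores(row)
        return max(scores, key=scores.get)


# Hypothetical toy data: two evidence sources per gene, two function labels.
rows = [
    {"cluster": "c1", "homolog": "kinase"},
    {"cluster": "c1", "homolog": "kinase"},
    {"cluster": "c2", "homolog": "none"},
    {"cluster": "c2", "homolog": "kinase"},
]
labels = ["signaling", "signaling", "metabolism", "metabolism"]

query = {"cluster": "c2", "homolog": "kinase"}
by_cluster = WeightedNaiveBayes({"cluster": 1.0, "homolog": 0.0}).fit(rows, labels)
by_homolog = WeightedNaiveBayes({"cluster": 0.0, "homolog": 1.0}).fit(rows, labels)
print(by_cluster.predict(query), by_homolog.predict(query))
```

In this toy setup the two sources disagree about the query gene, so shifting the weight from one source to the other changes the predicted label. That is the sense in which per-source weights can be read as a rough measure of each source's contribution.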
Year: 2006
DOI: 10.1142/S0219720006001928
Venue: J. Bioinformatics and Computational Biology
Keywords: machine learning, bayesian network
Field: Data mining, Gene, Expression (mathematics), Biology, Mechanism (biology), Artificial intelligence, Cluster analysis, Data collection, Naive Bayes classifier, Bayesian network, Bioinformatics, Gene Annotation, Machine learning
DocType: Journal
Volume: 4
Issue: 2
ISSN: 0219-7200
Citations: 2
PageRank: 0.37
References: 11
Authors: 3
Name            Order  Citations  PageRank
Xutao Deng      1      86         8.22
Huimin Geng     2      37         7.02
Hesham H. Ali   3      276        47.48