Title
Selective integration of multiple biological data for supervised network inference.
Abstract
Inferring networks of proteins from biological data is a central issue of computational biology. Most network inference methods, including Bayesian networks, take unsupervised approaches in which the network is totally unknown in the beginning, and all the edges have to be predicted. A more realistic supervised framework, proposed recently, assumes that a substantial part of the network is known. We propose a new kernel-based method for supervised graph inference based on multiple types of biological datasets such as gene expression, phylogenetic profiles and amino acid sequences. Notably, our method assigns a weight to each type of dataset and thereby selects informative ones. Data selection is useful for reducing data collection costs. For example, when a similar network inference problem must be solved for other organisms, the dataset excluded by our algorithm need not be collected.
First, we formulate supervised network inference as a kernel matrix completion problem, where the inference of edges boils down to estimation of missing entries of a kernel matrix. Then, an expectation-maximization algorithm is proposed to simultaneously infer the missing entries of the kernel matrix and the weights of multiple datasets. By introducing the weights, we can integrate multiple datasets selectively and thereby exclude irrelevant and noisy datasets. Our approach is favorably tested in two biological networks: a metabolic network and a protein interaction network.
Software is available on request.
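The kernel matrix completion idea in the abstract can be illustrated with a small sketch. This is not the authors' software: the function names `combine_kernels` and `complete_kernel`, the ridge term, the toy data, and the fixed weights are all assumptions made for illustration. The completion formula is the standard Gaussian-conditional fill-in, used here as a stand-in for the paper's estimation step, and the weight update of the expectation-maximization algorithm is deliberately left out.

```python
import numpy as np

def combine_kernels(kernels, weights, ridge=1e-3):
    """Weighted sum of per-dataset kernel matrices (hypothetical helper).

    A weight of zero excludes a dataset; the ridge keeps the result invertible.
    """
    M = sum(w * K for w, K in zip(weights, kernels))
    return M + ridge * np.eye(M.shape[0])

def complete_kernel(K_vv, M):
    """Fill in the missing blocks of a kernel matrix from a model kernel M.

    K_vv is the known block for the first n_v proteins; M covers all proteins.
    Assumed formula: the missing part follows M conditioned on the known part.
    """
    n_v = K_vv.shape[0]
    M_vv, M_vh, M_hh = M[:n_v, :n_v], M[:n_v, n_v:], M[n_v:, n_v:]
    A = np.linalg.solve(M_vv, M_vh).T          # A = M_hv M_vv^{-1}
    K_hv = A @ K_vv                            # known-to-unknown block
    K_hh = M_hh - A @ M_vh + A @ K_vv @ A.T    # unknown-to-unknown block
    return np.block([[K_vv, K_hv.T], [K_hv, K_hh]])

# Toy data: 6 proteins, two synthetic "datasets", first 4 proteins known.
rng = np.random.default_rng(0)
X1 = rng.normal(size=(6, 4))
X2 = rng.normal(size=(6, 3))
kernels = [X1 @ X1.T, X2 @ X2.T]
M = combine_kernels(kernels, weights=[0.7, 0.3])  # weights fixed by hand here
K_vv = M[:4, :4]                                  # pretend this block is observed
K = complete_kernel(K_vv, M)
```

In this toy setup the observed block is taken from the model kernel itself, so the completed matrix reproduces `M` up to rounding. With a real network kernel as `K_vv`, the filled-in entries would be the quantities from which missing edges are inferred.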
Year: 2005
DOI: 10.1093/bioinformatics/bti339
Venue: Bioinformatics
Keywords: multiple biological data, selective integration, protein interaction network, network inference method, similar network inference problem, metabolic network, missing entry, supervised network inference, multiple datasets, inferring network, bayesian network, biological network, kernel matrix, algorithms, gene expression regulation, signal transduction, transcription factors, biological data, artificial intelligence, systems integration, database management systems, computer simulation
Field: Data mining, Biological data, Computer science, Artificial intelligence, Biological network inference, Kernel (linear algebra), Inference, Biological network, Interaction network, Bayesian network, Bioinformatics, Kernel method, Machine learning
DocType: Journal
Volume: 21
Issue: 10
ISSN: 1367-4803
Citations: 35
PageRank: 1.71
References: 7
Authors: 3
Name, Order, Citations, PageRank
Tsuyoshi Kato, 1, 76, 4.26
Koji Tsuda, 2, 1664, 122.25
Kiyoshi Asai, 3, 846, 79.20