Title
An Integer Linear Programming Solution for the Domain-Gene-Species Reconciliation Problem.
Abstract
It is well-understood that most eukaryotic genes contain one or more protein domains and that the domain content of a gene can change over time. This change in domain content, through domain duplications, transfers, or losses, has important evolutionary and functional consequences. Recently, a powerful new reconciliation framework, called Domain-Gene-Species (DGS) reconciliation, was introduced to simultaneously model the evolution of a domain family inside one or more gene families and the evolution of those gene families inside a species tree. The underlying computational problem in DGS reconciliation is NP-hard and a heuristic algorithm is currently used to estimate optimal DGS reconciliations. However, this heuristic has several undesirable limitations. First, it offers no guarantee of optimality or near-optimality. Second, it can result in biologically unrealistic evolutionary scenarios. And third, it only computes a single DGS reconciliation even though there can be multiple optimal DGS reconciliations. In this work, we introduce the first exact algorithm for computing optimal DGS reconciliations that addresses all three limitations. Our algorithm is based on an integer linear programming formulation of the problem, which we solve iteratively by solving a series of linear programming relaxations. Our experimental results on over $3,400$ domain trees and over 7,000 gene trees from 12 fly species shows that our new algorithm is highly scalable and that it leads to significant improvement in DGS reconciliation inference. An implementation of our exact algorithm is available freely from http://compbio.engr.uconn.edu/software/seadog/.
Year
Venue
Field
2018
BCB
Computational problem,Mathematical optimization,Heuristic,Exact algorithm,Inference,Heuristic (computer science),Computer science,Integer programming,Linear programming,Artificial intelligence,Machine learning,Scalability
DocType
ISBN
Citations 
Conference
978-1-4503-5794-4
0
PageRank 
References 
Authors
0.34
10
2
Name
Order
Citations
PageRank
Lei Li121.38
Mukul S. Bansal229423.97