Title
A Combinatorial Approach to Genome-Wide Ortholog Assignment: Beyond Sequence Similarity Search
Abstract
The assignment of orthologous genes between a pair of genomes is a fundamental and challenging problem in comparative genomics. Existing methods that assign orthologs based on the similarity between DNA or protein sequences may make erroneous assignments when sequence similarity does not clearly delineate the evolutionary relationship among genes of the same family. In this paper, we present a new approach to ortholog assignment that takes into account both sequence similarity and evolutionary events at the genome level, where orthologous genes are assumed to correspond to each other in the most parsimonious evolutionary scenario under genome rearrangement and gene duplication. The ortholog assignment problem is then formulated as a problem of computing the signed reversal distance with duplicates between the two genomes of interest, for which an efficient heuristic algorithm is constructed based on solutions to two new optimization problems, minimum common partition and maximum cycle decomposition. Following this approach, we have implemented a high-throughput system for assigning orthologs on a genome scale, called MSOAR, and tested it on both simulated data and real genome sequence data. Our predicted orthologs between the human and mouse genomes are strongly supported by ortholog and protein function information in authoritative databases, and by predictions made by other key ortholog assignment methods such as Ensembl, Homologene, INPARANOID, and HGNC. The simulation results demonstrate that MSOAR in general performs better than D. Sankoff's iterated exemplar algorithm in terms of identifying true exemplar genes. This is joint work with X. Chen (Nanyang Tech. Univ., Singapore), Z. Fu (UCR), J. Zheng (NCBI), V. Vacic (UCR), P. Nan (SCBIT), Y. Zhong (SCBIT), and S. Lonardi (UCR).
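To make the notion of signed reversal distance with duplicates concrete, the following is a minimal sketch and not the MSOAR heuristic described in the abstract. It assumes a genome is a sequence of signed integers, where the absolute value names a gene family and the sign gives the strand, and it finds the minimum number of reversals by exhaustive breadth-first search, which is only feasible for toy inputs. The function names `reverse_segment` and `reversal_distance` are hypothetical and chosen for illustration.

```python
from collections import deque


def reverse_segment(genome, i, j):
    """Apply one signed reversal to positions i..j (inclusive).

    The segment's order is reversed and every sign in it is flipped.
    """
    genome = tuple(genome)
    segment = tuple(-g for g in reversed(genome[i:j + 1]))
    return genome[:i] + segment + genome[j + 1:]


def reversal_distance(source, target):
    """Minimum number of signed reversals turning source into target.

    Illustrative brute force (assumption: tiny genomes only); the state
    space grows exponentially, so this is not a practical algorithm.
    Duplicate genes are allowed because states are whole sequences.
    """
    source, target = tuple(source), tuple(target)
    if sorted(map(abs, source)) != sorted(map(abs, target)):
        raise ValueError("genomes must contain the same multiset of genes")
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        genome, dist = queue.popleft()
        if genome == target:
            return dist
        n = len(genome)
        for i in range(n):
            for j in range(i, n):
                nxt = reverse_segment(genome, i, j)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
    return None


if __name__ == "__main__":
    # Gene family 1 appears twice, once on each strand.
    print(reversal_distance((1, 2, -1), (1, -2, -1)))
```

Exhaustive search is used here only because it is easy to verify by hand; the abstract's approach instead relies on a heuristic built from minimum common partition and maximum cycle decomposition.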
Year: 2007
DOI: 10.1007/978-3-540-73437-6_1
Venue: CPM
Keywords: real genome sequence data, genome rearrangement, beyond sequence similarity search, genome level, key ortholog assignment method, combinatorial approach, assigning orthologs, genome scale, mouse genomes, erroneous assignment, sequence similarity search, orthologous gene, sequence similarity, genome-wide ortholog assignment, optimization problem, gene duplication, high throughput, heuristic algorithm, genome sequence, protein sequence, comparative genomics
Field: HUGO Gene Nomenclature Committee, Genome, Combinatorics, Computer science, Ensembl, Inparanoid, HomoloGene, Comparative genomics, Whole genome sequencing, Computational biology, Genetics, Nearest neighbor search
DocType: Conference
ISBN: 3-540-73436-8
Citations: 0
PageRank: 0.34
References: 1
Authors: 1
Author name: Tao Jiang
Order: 1
Citations: 1809
PageRank: 155.32