Title
Information theoretic approaches to whole genome phylogenies
Abstract
We describe a novel method for efficiently reconstructing phylogenetic trees from the sequences of whole genomes or proteomes. The core of our method is a new measure of pairwise distance between sequences whose lengths may vary greatly. The measure is based on information theoretic tools (Kullback-Leibler relative entropy). We present an algorithm that computes these distances efficiently: it uses suffix arrays to compute the distance between two sequences of length ℓ in O(ℓ) time. It is fast enough to enable the construction of the phylogenomic tree for hundreds of species, and the phylogenomic forest for almost two thousand viruses. An initial analysis of the results exhibits a remarkable agreement with “acceptable phylogenetic truth”. To assess our approach, we implemented it together with a number of alternative approaches, including two previously published in the literature. When the outcomes were compared against a “traditional” tree using a standard tree comparison method, our algorithm improved upon the “competition” by a substantial margin.
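The abstract does not state the distance formula, so the following is only a minimal sketch of one measure of this general kind: an average-common-substring style estimate, in which long shared substrings indicate low relative entropy between two sequences. The function names, the length-normalisation terms, and the symmetrisation are assumptions made for illustration and are not taken from the abstract. The matching step is a naive scan with a much higher cost than the O(ℓ) suffix array computation the abstract describes.

```python
import math


def longest_match_lengths(a: str, b: str) -> list[int]:
    """For each position i of a, the length of the longest prefix of a[i:] that occurs in b."""
    lengths = []
    for i in range(len(a)):
        k = 0
        # Extend the match one character at a time (naive; a suffix array would avoid this).
        while i + k < len(a) and a[i:i + k + 1] in b:
            k += 1
        lengths.append(k)
    return lengths


def directed_distance(a: str, b: str) -> float:
    """Assumed directed measure: small when a shares long substrings with b."""
    if not a or not b:
        raise ValueError("sequences must be non-empty")
    avg = sum(longest_match_lengths(a, b)) / len(a)
    if avg == 0:
        return math.inf  # no shared characters at all
    # The second term is an assumed correction so that d(a, a) is close to zero.
    return math.log(len(b)) / avg - 2 * math.log(len(a)) / len(a)


def symmetric_distance(a: str, b: str) -> float:
    """Average of the two directed values, so the result does not depend on argument order."""
    return 0.5 * (directed_distance(a, b) + directed_distance(b, a))


if __name__ == "__main__":
    x = "ACGTACGTTGCA"
    y = "ACGTACGATGCA"   # one substitution away from x
    z = "GGGGCCCCGGGG"   # unrelated composition
    print(symmetric_distance(x, y), symmetric_distance(x, z))
```

Pairwise values computed this way can be collected into a distance matrix over all genomes, which is the input a distance-based tree-building method expects. Because the directed value is normalised by sequence length, the two sequences do not need to be the same length.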
Year: 2005
DOI: 10.1007/11415770_22
Venue: RECOMB
Keywords: new measure, novel method, kullback-leibler relative entropy, phylogenomic forest, acceptable phylogenetic truth, whole genome phylogeny, phylogenomic tree, efficient reconstruction, alternative approach, phylogenetic tree, information theoretic approach, standard tree comparison method, divergence, kullback leibler, maximum likelihood method, relative entropy, distance matrix, phylogenomics
Field: Genome, Pairwise comparison, Phylogenetic tree, Biology, Suffix, Distance matrix, Bioinformatics, Phylogenomics, Kullback–Leibler divergence
DocType: Conference
Volume: 3500
ISSN: 0302-9743
ISBN: 3-540-25866-3
Citations: 4
PageRank: 0.49
References: 18
Authors: 4
Name
Order
Citations
PageRank
David Burstein1744.44
Igor Ulitsky229912.96
Tamir Tuller342647.25
Benny Chor42981520.40