Title
Skewed base compositions, asymmetric transition matrices, and phylogenetic invariants.
Abstract
Evolutionary inference methods that assume equal DNA base compositions and symmetric nucleotide substitution matrices are likely, where these assumptions do not hold, to group species on the basis of similar base compositions rather than true phylogenetic relationships. We propose an invariants-based method for dealing with this problem. An invariant Q_T of a tree T under a k-state Markov model, where a generalized time parameter is identified with each of the E edges of T, allows us to recognize whether data on N observed species can be associated with the N terminal vertices of T, in the sense of having been generated on T rather than on any other tree with N terminals. The generalized time parameter takes the form of a positive-determinant matrix in some semigroup S of stochastic matrices. The invariance is with respect to the choice of the set of E matrices in S, one associated with each of the E edges of T. We apply a general "empirical" method of finding invariants of a parametrized functional form. It involves calculating the probability f of all k^N data possibilities for each of m sets of E matrices in S associated with the edges of T, then solving for the parameters using the m equations of form Q(f) = 0. We discuss the problems of finding asymmetric models satisfying the property of semigroup closure, of finding asymmetric models that admit invariants at all, and of the computational complexity of the method. We propose a class of semigroups S_c containing matrices of form [formula: see text] to account for A+T versus G+C asymmetries in DNA base composition. Quadratic invariants are obtained for rooted trees with three and with four terminals. In the latter case the smallest set of algebraically independent invariants is sought. These invariants are applied to data pertaining to fungal evolution and to the origin of mitochondria as bacterial endosymbionts.
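The "empirical" procedure described in the abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation. Everything specific in it is an assumption made for the sake of a checkable example: a two-state alphabet (k = 2), the rooted three-terminal tree ((1,2),3) with E = 4 edges, a uniform root distribution, and symmetric matrices with change probability p < 1/2 (which have positive determinant and are closed under multiplication). This symmetric toy semigroup is swapped in only because its invariants are easy to verify by hand; it is not the asymmetric class S_c of the paper, whose matrix form is not reproduced here. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product, combinations_with_replacement
import random

K = 2  # number of states (toy assumption; DNA would use 4)
N = 3  # number of terminal vertices

def sym_matrix(p):
    """Symmetric 2-state stochastic matrix; determinant 1 - 2p > 0 when p < 1/2."""
    return [[1 - p, p], [p, 1 - p]]

def pattern_probs(mats):
    """Probability f of each of the K**N patterns on the rooted tree ((1,2),3).

    mats = (root->u, u->leaf1, u->leaf2, root->leaf3); the root distribution
    is assumed uniform.
    """
    m_u, m_1, m_2, m_3 = mats
    f = []
    for x1, x2, x3 in product(range(K), repeat=N):
        total = Fraction(0)
        for r in range(K):          # state at the root
            for u in range(K):      # state at the internal vertex
                total += (Fraction(1, K) * m_u[r][u] * m_1[u][x1]
                          * m_2[u][x2] * m_3[r][x3])
        f.append(total)
    return f

def monomials(f, degree):
    """All monomials of the given degree in the entries of f."""
    out = []
    for idx in combinations_with_replacement(range(len(f)), degree):
        term = Fraction(1)
        for i in idx:
            term *= f[i]
        out.append(term)
    return out

def null_space(rows):
    """Exact null-space basis of a matrix, via reduced row echelon form."""
    rows = [list(r) for r in rows]
    ncols = len(rows[0])
    pivots = []
    rank = 0
    for c in range(ncols):
        pr = next((i for i in range(rank, len(rows)) if rows[i][c] != 0), None)
        if pr is None:
            continue
        rows[rank], rows[pr] = rows[pr], rows[rank]
        piv = rows[rank][c]
        rows[rank] = [v / piv for v in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rank])]
        pivots.append(c)
        rank += 1
    basis = []
    for free in (c for c in range(ncols) if c not in pivots):
        vec = [Fraction(0)] * ncols
        vec[free] = Fraction(1)
        for r, pc in enumerate(pivots):
            vec[pc] = -rows[r][free]
        basis.append(vec)
    return basis

def random_edge_matrices(rng):
    """One parameter set: E = 4 matrices from the toy semigroup (0 < p < 1/2)."""
    return [sym_matrix(Fraction(rng.randint(1, 49), 100)) for _ in range(4)]

def empirical_invariants(degree=1, m=40, seed=0):
    """Coefficient vectors Q with Q(f) = 0 for all m sampled parameter sets."""
    rng = random.Random(seed)
    rows = [monomials(pattern_probs(random_edge_matrices(rng)), degree)
            for _ in range(m)]
    return null_space(rows)
```

Each row of the linear system holds the monomials of one sampled f, so a null-space vector is a candidate invariant: it vanishes on every sampled parameter set. With degree=1 the sketch recovers linear invariants of the toy model; degree=2 would search for quadratic forms, the degree reported in the abstract, provided m is at least the number of monomials. A candidate found this way should still be checked on fresh parameter sets or proved algebraically.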
Year
1994
Venue
Journal of Computational Biology
Field
Discrete mathematics, Combinatorics, Phylogenetic tree, Parametrization, Invariant (physics), Vertex (geometry), Markov model, Matrix (mathematics), Invariant (mathematics), Semigroup, Mathematics
DocType
Journal
Volume
1
Issue
1
ISSN
1066-5277
Citations
2
PageRank
0.83
References
0
Authors
3
Name | Order | Citations | PageRank
Vincent Ferretti | 1 | 63 | 12.62
B. Franz Lang | 2 | 103 | 16.99
David Sankoff | 3 | 1590 | 240.76