Title
TreeLign: simultaneous stepwise alignment and phylogenetic positioning, with its application to automatic phylogenetic assignment of 16S rRNAs
Abstract
Phylogenetic assignment of 16S rRNA is frequently used for taxonomic classification. Recently, high-throughput sequencing, especially in environmental and metagenomic sequencing projects, has made fast and accurate taxonomic classification an important goal. Existing classification methods are either fast but too coarse-grained and inaccurate, or fine-grained and accurate but too slow for practical use. In this paper, we propose a new computational method, TreeLign, that rapidly and accurately aligns novel sequences and assigns them phylogenetically, given a reference phylogenetic tree and an alignment. TreeLign first constructs a profile for every branch of the reference tree, then tries assigning each query sequence to every possible branch, and finally obtains a new tree and a new alignment that are jointly optimal in terms of Maximum Parsimony (MP). We tested the accuracy and robustness of TreeLign on a large and a small 16S rRNA dataset, both extracted from the core set of GreenGenes. The results on the large dataset show that TreeLign's assignments are in general consistent with the phylogenetic tree of the GreenGenes core set. The results on the small dataset show that TreeLign achieves accuracy comparable to that of existing maximum-likelihood-based methods while requiring much less computational time.
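The branch-by-branch search described in the abstract can be illustrated with a toy sketch. This is not TreeLign's implementation: it assumes the query is already aligned to the reference (TreeLign aligns and places simultaneously), it approximates each branch "profile" with the Fitch parsimony state sets of the subtree below that branch, and the names `Node`, `fitch_sets`, `mismatch_cost` and `best_branch` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str
    seq: Optional[str] = None          # aligned sequence (leaves only)
    children: list = field(default_factory=list)


def fitch_sets(node, out):
    """Bottom-up Fitch pass: record per-column state sets for every node."""
    if not node.children:
        out[node.name] = [{c} for c in node.seq]
        return out[node.name]
    child_sets = [fitch_sets(child, out) for child in node.children]
    merged = []
    for column in zip(*child_sets):
        common = set.intersection(*column)
        # Fitch rule: keep the intersection if non-empty, else the union.
        merged.append(common if common else set.union(*column))
    out[node.name] = merged
    return merged


def mismatch_cost(query, profile):
    """Count columns where the query state is absent from the profile."""
    return sum(1 for c, states in zip(query, profile) if c not in states)


def best_branch(root, query):
    """Try every branch (identified by its child node) and keep the cheapest."""
    profiles = {}
    fitch_sets(root, profiles)
    costs = {name: mismatch_cost(query, p)
             for name, p in profiles.items() if name != root.name}
    best = min(costs, key=costs.get)   # ties resolve to the first branch visited
    return best, costs[best]


# Assumed four-taxon reference tree ((A,B),(C,D)) with made-up sequences.
tree = Node("root", children=[
    Node("AB", children=[Node("A", "ACGT"), Node("B", "ACGA")]),
    Node("CD", children=[Node("C", "TTGT"), Node("D", "TTGA")]),
])
branch, cost = best_branch(tree, "ACGA")
```

Scoring against one-sided subtree sets keeps the sketch short; a profile that also reflects the sequences on the other side of each branch would be closer to what "profiles of every branch" suggests, but the abstract does not specify how the profiles are built.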
Year: 2011
DOI: 10.1145/2147805.2147868
Venue: BCB
Keywords: reference tree, phylogenetic assignment, new tree, new alignment, large dataset show, accurate taxonomic classification, simultaneous stepwise alignment, automatic phylogenetic assignment, phylogenetic positioning, phylogenetic tree, reference phylogenetic tree, existing classification method, new computational method, taxonomic classification, 16s rrna, maximum likelihood, high throughput, maximum parsimony
Field: Data mining, Biological classification, Maximum parsimony, Phylogenetic tree, Pattern recognition, Tree rearrangement, Biology, Maximum likelihood, Robustness (computer science), Metagenomics, Artificial intelligence, Computational phylogenetics
DocType: Conference
Citations: 0
PageRank: 0.34
References: 8
Authors: 3
Name, Order, Citations, PageRank
Yuan Li, 1, 2, 1.43
Aaron L. Halpern, 2, 55, 14.04
Shaojie Zhang, 3, 203, 28.81