Title
Using de novo protein structure predictions to measure the quality of very large multiple sequence alignments.
Abstract
Motivation: Multiple sequence alignments (MSAs) with large numbers of sequences are now commonplace. However, current multiple alignment benchmarks are ill-suited for testing these types of alignments, as test cases either contain a very small number of sequences or are based purely on simulation rather than empirical data.
Results: We take advantage of recent developments in protein structure prediction methods to create a benchmark (ContTest) for protein MSAs containing many thousands of sequences in each test case and which is based on empirical biological data. We rank popular MSA methods using this benchmark and verify a recent result showing that chained guide trees increase the accuracy of progressive alignment packages on datasets with thousands of proteins.
Year: 2016
DOI: 10.1093/bioinformatics/btv592
Venue: BIOINFORMATICS
Field: Small number, Sequence alignment, Data mining, Biological data, Protein structure prediction, Computer science, Software, Test case, Bioinformatics, Multiple sequence alignment, Scripting language
DocType: Journal
Volume: 32
Issue: 6
ISSN: 1367-4803
Citations: 5
PageRank: 0.48
References: 15
Authors: 3
Name | Order | Citations | PageRank
Gearóid Fox | 1 | 5 | 0.48
Fabian Sievers | 2 | 78 | 5.49
Desmond G. Higgins | 3 | 1263383.91