Title
pGenTHREADER and pDomTHREADER: new methods for improved protein fold recognition and superfamily discrimination.
Abstract
Motivation: Generation of structural models and recognition of homologous relationships for unannotated protein sequences are fundamental problems in bioinformatics. Improving the sensitivity and selectivity of methods designed for these two tasks therefore has downstream benefits for many other bioinformatics applications.
Results: We describe the latest implementation of the GenTHREADER method for structure prediction on a genomic scale. The method combines profile-profile alignments with secondary-structure-specific gap penalties and classic pair and solvation potentials, using a linear combination optimized with a regression SVM model. We find that this combination significantly improves both the detection of useful templates and the accuracy of sequence-structure alignments relative to other competitive approaches. We further present a second implementation of the protocol, designed for the task of discriminating superfamilies from one another. This method, pDomTHREADER, is the first to incorporate both sequence and structural data directly in this task, and it improves sensitivity and selectivity over the standard version of pGenTHREADER and three other standard methods for remote homology detection.
Year: 2009
DOI: 10.1093/bioinformatics/btp302
Venue: BIOINFORMATICS
Keywords: twilight zone, information theory
Field: Information theory, Linear combination, Data mining, Regression, Computer science, Support vector machine, Software, Artificial intelligence, Template, Bioinformatics, Machine learning
DocType: Journal
Volume: 25
Issue: 14
ISSN: 1367-4803
Citations: 19
PageRank: 1.23
References: 9
Authors: 3
Name                 Order  Citations  PageRank
Anna E. Lobley       1      43         2.99
Michael I. Sadowski  2      29         3.22
David T. Jones       3      546        35.06