Title
Fast assignment of protein structures to sequences using the Intermediate Sequence Library PDB-ISL.
Abstract
Motivation: For large-scale structural assignment to sequences, as in computational structural genomics, a fast yet sensitive sequence search procedure is essential. A new approach using intermediate sequences was tested as a shortcut to iterative multiple sequence search methods such as PSI-BLAST. Results: A library containing potential intermediate sequences for proteins of known structure (PDB-ISL) was constructed. The sequences in the library were collected from a large sequence database using the sequences of the domains of proteins of known structure as the query sequences and the program PSI-BLAST. Sequences of proteins of unknown structure can be matched to distantly related proteins of known structure by using pairwise sequence comparison methods to find homologues in PDB-ISL. Searches of PDB-ISL were calibrated, and the number of correct matches found at a given error rate was the same as that found by PSI-BLAST. The advantage of this library is that it uses pairwise sequence comparison methods, such as FASTA or BLAST2, and can, therefore, be searched easily and, in many cases, much more quickly than an iterative multiple sequence comparison method. The procedure is roughly 20 times faster than PSI-BLAST for small genomes and several hundred times faster for large genomes.
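The lookup step described in the abstract can be illustrated with a minimal sketch. It assumes a toy library in which every intermediate sequence is tagged with the known-structure domain whose search collected it. The sequences, the identifiers (isl_001, d1abc_1 and so on), the function names and the 0.6 threshold are all hypothetical, and difflib's similarity ratio is only a stand-in for a real pairwise comparison program such as FASTA or BLAST2; it is not the scoring or calibration used in the paper.

```python
from difflib import SequenceMatcher

# Hypothetical toy library: intermediate sequence id -> (sequence, domain id).
# In the approach described above, the intermediates would be collected in
# advance by PSI-BLAST searches seeded with domains of known structure.
PDB_ISL = {
    "isl_001": ("MKTAYIAKQRQISFVKSHFSRQ", "d1abc_1"),
    "isl_002": ("GSHMLEDPVDAFKLNGQWERTY", "d2xyz_1"),
}


def pairwise_score(a, b):
    """Crude similarity in [0, 1]; a stand-in for FASTA/BLAST2 scoring."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def assign_structure(query, library, threshold=0.6):
    """Return (domain, intermediate id, score) for the best hit, or None.

    The threshold is an arbitrary illustration value, not a calibrated
    error-rate cutoff.
    """
    best_id, best_score = None, 0.0
    for isl_id, (seq, _domain) in library.items():
        score = pairwise_score(query, seq)
        if score > best_score:
            best_id, best_score = isl_id, score
    if best_id is None or best_score < threshold:
        return None
    return library[best_id][1], best_id, best_score


if __name__ == "__main__":
    # A query that differs from isl_001 at one position is assigned d1abc_1.
    print(assign_structure("MKTAYIAKQRQISFVKSHFSRA", PDB_ISL))
```

The point of the sketch is the design choice: the expensive iterative search is done once when the library is built, so each new query needs only a single pass of pairwise comparisons followed by a lookup of the tagged domain.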
Year
2000
DOI
10.1093/bioinformatics/16.2.117
Venue
BIOINFORMATICS
Keywords
protein structure, structural genomics, error rate
Field
Genome, Sequence alignment, Pairwise comparison, Structural genomics, Sequence database, Computer science, Search procedure, Algorithm, Bioinformatics, Protein Data Bank (RCSB PDB), Protein structure
DocType
Journal
Volume
16
Issue
2
ISSN
1367-4803
Citations
13
PageRank
2.16
References
8
Authors
4
Name               Order  Citations  PageRank
Sarah A Teichmann  1      239        22.64
Cyrus Chothia      2      1325       235.86
George M. Church   3      1527       165.03
Jong Park          4      21         4.21