Title
Probabilities for having a new fold on the basis of a map of all protein sequences
Abstract
Predicting which proteins have new, currently unknown structural folds is a major problem in the study of protein structure. To address this problem, we studied the location of all proteins with solved structures within the map of all known protein sequences provided by ProtoMap. The mutual distances among solved structures in this map are used to derive a probabilistic model, from which we estimate the probability that an unsolved protein has a new fold. The probabilities were based on data from SCOP release 1.37, and the results were evaluated against the more recent SCOP pre-release 1.41. Our predicted probabilities for unsolved proteins to have a new fold correlate very well with the proportion of new folds among recently released structures. Thus, information about protein structure can be inferred from a global relational view of protein sequences. Finally, the same procedure was applied to estimate probabilities on the basis of SCOP 1.41. A list of the highest-scoring proteins is provided: about 80 non-membranous proteins that belong to clusters of more than 5 proteins and have the highest probability of having a new fold. Rational selection of these targets for 3D structure determination is expected to accelerate the pace of new fold discovery.
Year: 2000
DOI: 10.1145/332306.332561
Venue: RECOMB
Keywords: non-membranous protein,new fold,highest scoring protein,structure prediction,protein structure,structural genomics,unknown structural fold,scop release,known protein,recent scop pre-release,statistical mode,unsolved protein,protein sequence,global protein organization,clustering,probabilistic model,membrane protein
Field: Structural genomics,Biology,Protomap (neuroscience),Threading (protein sequence),Statistical model,Bioinformatics,Cluster analysis,Statistical Mode,Fold (geology),Protein structure
DocType: Conference
ISBN: 1-58113-186-0
Citations: 0
PageRank: 0.34
References: 2
Authors: 2
Name | Order | Citations | PageRank
Elon Portugaly | 1 | 286 | 25.89
Michal Linial | 2 | 1502 | 149.92