Title
Compression ratios based on the Universal Similarity Metric still yield protein distances far from CATH distances
Abstract
Motivation: Kolmogorov complexity has inspired several alignment-free distance measures, based on comparing the lengths of compressions, which have been applied successfully in many areas. One of these measures, the so-called Universal Similarity Metric (USM), has been used by Krasnogor and Pelta to compare simple protein contact maps, showing that it yielded good clustering on four small datasets.
Results: We report an extensive test of this metric using a much larger, representative protein dataset: the domain dataset used by Sierk and Pearson to evaluate seven protein structure comparison methods and two protein sequence comparison methods. One result is that, when using these simple contact maps, the Krasnogor-Pelta method has less domain discriminant power than any of the methods considered by Sierk and Pearson. In another test, we found that the USM-based distance has low agreement with the CATH tree structure on the same Sierk and Pearson benchmark. In any case, its agreement is lower than that of a standard sequence alignment method, SSEARCH.
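As an illustration of what a distance "based on the comparison of lengths of compressions" can look like, here is a minimal sketch of the normalized compression distance, a commonly used computable stand-in for Kolmogorov-complexity-based metrics such as the USM. It is not taken from the paper: the choice of zlib as the compressor, the function names `compressed_len` and `ncd`, and the idea of passing serialized contact maps as byte strings are all assumptions made for illustration.

```python
import zlib


def compressed_len(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (assumed compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    Uses C(xy), the compressed length of the concatenation, as a rough
    proxy for the joint Kolmogorov complexity of x and y.
    """
    cx, cy = compressed_len(x), compressed_len(y)
    cxy = compressed_len(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    # Hypothetical toy inputs standing in for serialized contact maps.
    map_a = b"0110100110010110" * 50
    map_b = b"0110100110010111" * 50
    print(ncd(map_a, map_a), ncd(map_a, map_b))
```

Values near 0 suggest the two inputs share most of their compressible structure, while values near 1 suggest little shared structure. With a real compressor the result is only an approximation and can depend noticeably on the compressor and on input length.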
Year
2006
Venue
Clinical Orthopaedics and Related Research
Keywords
data analysis, protein sequence, quantitative method, tree structure, compression ratio
Field
Kolmogorov complexity, Protein structure comparison, Discriminant, Compression ratio, Tree structure, Bioinformatics, Cluster analysis, Mathematics, Distance measures
DocType
Journal
Volume
abs/q-bio/
Citations
11
PageRank
0.69
References
11
Authors
3
Name               Order  Citations  PageRank
Jairo Rocha        1      39         6.56
Francesc Rosselló  2      244        29.09
Joan Segura        3      11         0.69