Probing the Randomness of Proteins by Their Subsequence Composition - Citegraph

Paper Info

Title
Probing the Randomness of Proteins by Their Subsequence Composition

Abstract
The quantitative underpinning of the information content of biosequences represents an elusive goal and yet also an obvious prerequisite to the quantitative modeling and study of biological function and evolution. Previous studies have consistently exposed a tenacious lack of compressibility on the part of biosequences. This leaves open the question of what distinguishes them from random strings, the latter being clearly unpalatable to the living cell. This paper assesses the randomness of biosequences in terms of newly introduced parameters that relate to the vocabulary of their (suitably constrained) subsequences rather than their substrings. Results from experiments show the potential of the method in distinguishing a protein sequence from its random reshuffling, as well as in tasks of classification and clustering.
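
The distinction the abstract draws between substrings and constrained subsequences can be illustrated with a minimal sketch. The abstract does not define the paper's parameters, so everything here is an assumption made for illustration: the constraint is taken to be a bounded window (`max_span`), and the function names, the word length `k`, and the toy sequence in the usage note are hypothetical. This is not the authors' method.

```python
import random
from itertools import combinations


def substring_vocabulary(s, k):
    """Distinct contiguous words of length k occurring in s."""
    return {s[i:i + k] for i in range(len(s) - k + 1)}


def windowed_subsequence_vocabulary(s, k, max_span):
    """Distinct length-k subsequences of s whose characters all lie
    within max_span consecutive positions.

    Assumption: a bounded window stands in for the abstract's
    "suitably constrained" subsequences; the paper may constrain
    them differently.
    """
    vocab = set()
    for start in range(len(s)):
        # The first character is fixed at `start`; the remaining k - 1
        # are chosen, in order, from the rest of the window.
        window = s[start + 1:start + max_span]
        for rest in combinations(range(len(window)), k - 1):
            vocab.add(s[start] + "".join(window[j] for j in rest))
    return vocab


def shuffled(s, seed):
    """Random reshuffling of s: same character counts, random order."""
    chars = list(s)
    random.Random(seed).shuffle(chars)
    return "".join(chars)
```

A possible use is to compare vocabulary sizes for a sequence and its reshuffling, for example `len(windowed_subsequence_vocabulary(seq, 3, 6))` against the same call on `shuffled(seq, seed=0)`, where `seq` is any amino-acid string such as the made-up `"MKTAYIAKQRQISFVKSHFSRQ"`. When `max_span` equals `k`, the windowed vocabulary reduces to the substring vocabulary, which is one way to see subsequences as a generalization of substrings.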

Year	DOI	Venue
2009	10.1109/DCC.2009.60	DCC
Keywords	Field	DocType
random reshuffling,living cell,biological function,previous study,quantitative underpinning,information content,subsequence composition,elusive goal,random string,obvious prerequisite,quantitative modeling,clustering,probability density function,dna,data mining,data compression,classification,protein sequence,organisms,construction industry,proteins,genetics	Substring,Computer science,Living cell,Theoretical computer science,Construction industry,Subsequence,Cluster analysis,Probability density function,Vocabulary,Randomness	Conference
ISSN	Citations	PageRank
1068-0314	0	0.34
References	Authors
4	2

Authors (2 rows)

Name	Order	Citations	PageRank
Alberto Apostolico	1	1441	182.20
Fabio Cunial	2	72	9.68
