Abstract | ||
---|---|---|
A procedure that automatically provides an evaluation of the diagnostic ability of a protein sequence functional pattern is described. The procedure relies on the identification of the closest definable set in terms of a (protein sequence) database functional annotation to the set of database instances containing a given pattern. Assuming annotation correctness and completeness in the protein sequence database, the degree of statistical association between these sets provides an appropriate measure of the diagnostic ability of the pattern. An experimental implementation of the procedure, using the NBRF/PIR protein database, has been applied to a diverse collection of published sequence patterns. Results obtained reveal that frequently it is not possible to define (in NBRF/PIR database terminology) the set of database instances containing a given pattern, suggesting either lack of pattern diagnostic ability or protein database annotation incompleteness and/or inconsistencies. |
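
The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that a "definable set" is simply the set of entries sharing a single annotation term, and it uses the phi coefficient of the 2x2 contingency table as one possible measure of statistical association. The sequences, annotation terms, pattern, and function names below are all hypothetical toy values, not taken from NBRF/PIR.

```python
import math
import re


def phi_coefficient(a, b, universe_size):
    """Phi coefficient of the 2x2 contingency table for sets a and b."""
    n11 = len(a & b)                      # in both sets
    n10 = len(a - b)                      # pattern match only
    n01 = len(b - a)                      # annotation only
    n00 = universe_size - n11 - n10 - n01  # in neither set
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return 0.0  # association is undefined when a margin is empty
    return (n11 * n00 - n10 * n01) / denom


def closest_annotation_set(pattern, sequences, annotations):
    """Return (matched entries, best annotation term, association score).

    Assumption: candidate sets are limited to single annotation terms.
    """
    matched = {eid for eid, seq in sequences.items() if re.search(pattern, seq)}

    # Invert the annotation table: term -> set of entries carrying it.
    by_term = {}
    for eid, terms in annotations.items():
        for term in terms:
            by_term.setdefault(term, set()).add(eid)

    best_term, best_score = None, float("-inf")
    for term, members in sorted(by_term.items()):
        score = phi_coefficient(matched, members, len(sequences))
        if score > best_score:
            best_term, best_score = term, score
    return matched, best_term, best_score


# Hypothetical toy database: entry id -> sequence, entry id -> annotation terms.
toy_sequences = {
    "E1": "MKGAGKSTLL",
    "E2": "AAGSGKTTVV",
    "E3": "MLLPPQQRRA",
    "E4": "GGGDGKSAAA",
}
toy_annotations = {
    "E1": {"kinase", "ATP-binding"},
    "E2": {"ATP-binding"},
    "E3": {"kinase"},
    "E4": {"ATP-binding"},
}

matched, best_term, best_score = closest_annotation_set(
    r"G.GK[ST]", toy_sequences, toy_annotations
)
```

In this toy case the pattern's match set coincides exactly with one annotation term, so the association is maximal. A low best score would correspond to the situation the abstract describes, where no annotation-defined set matches the pattern's instances, pointing either to weak diagnostic ability or to incomplete or inconsistent annotation.
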
Year | DOI | Venue |
---|---|---|
1991 | 10.1093/bioinformatics/7.3.309 | Computer Applications in the Biosciences |
Keywords | Field | DocType |
protein sequence | Data mining,Annotation,Protein sequencing,Computer science,Bioinformatics,Definable set | Journal |
Volume | Issue | ISSN |
7 | 3 | 0266-7061 |
Citations | PageRank | References |
3 | 0.60 | 0 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Roderic Guigó | 1 | 57 | 13.69 |
A Johansson | 2 | 3 | 0.60 |
Temple F. Smith | 3 | 139 | 73.26 |