Title
Inferring Correlation Between Database Queries: Analysis of Protein Sequence Patterns
Abstract
Given a subset P of a database, the problem of finding the query phi in a given database attribute having the closest extension to P is addressed. In the particular case that is outlined, P is the set of protein sequences in a protein sequence database matching a given protein sequence pattern, whereas phi is a query in the annotation of the database. Ideally, phi is the description of a biological function. If the extension of phi is very similar to P, an association between the pattern and the biological function described by the query may be inferred. An algorithm that efficiently searches the query space when negation is not considered is developed. Since the query language is a first-order language, the query space may be mapped into a set algebra in which a measure of stochastic dependence (an asymptotic approximation of the correlation coefficient) is used as a measure of set similarity. The algorithm uses the algebraic properties of such a measure to reduce the time required to search the query space. A prototype implementation of the algorithm has been tested on different collections of protein sequence patterns.
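The idea in the abstract can be illustrated with a minimal sketch. It assumes that each annotation keyword is represented by the set of database entry IDs it retrieves, that AND maps to set intersection and OR to set union, and that similarity is the ordinary correlation (phi) coefficient between set-membership indicators. The paper itself uses an asymptotic approximation of that coefficient and a pruned search; the exhaustive loop over keyword pairs, the function names `correlation` and `best_query`, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def correlation(p, q, n):
    """Correlation (phi) coefficient between the membership indicators
    of sets p and q, both drawn from a database of n entries."""
    a, b, both = len(p), len(q), len(p & q)
    denom = sqrt(a * (n - a) * b * (n - b))
    if denom == 0:
        # One of the sets is empty or covers the whole database.
        return 0.0
    return (n * both - a * b) / denom


def best_query(pattern_hits, keyword_sets, n):
    """Score single keywords and pairwise AND/OR combinations (no negation)
    and return the (query, score) pair closest to pattern_hits.
    Illustrative brute force, not the paper's pruned search."""
    candidates = dict(keyword_sets)
    for (k1, s1), (k2, s2) in combinations(keyword_sets.items(), 2):
        candidates[f"({k1} AND {k2})"] = s1 & s2  # conjunction -> intersection
        candidates[f"({k1} OR {k2})"] = s1 | s2   # disjunction -> union
    scored = ((query, correlation(pattern_hits, ext, n))
              for query, ext in candidates.items())
    return max(scored, key=lambda pair: pair[1])


# Hypothetical toy data: 10 database entries, 4 of which match the pattern.
pattern_hits = {0, 1, 2, 3}
keyword_sets = {
    "kinase": {0, 1, 4},
    "ATP-binding": {2, 3, 5},
    "membrane": {6, 7, 8},
}
query, score = best_query(pattern_hits, keyword_sets, 10)
print(query, round(score, 3))
```

On this toy data the disjunction of the two partially overlapping keywords scores higher than either keyword alone, which is the kind of association between a pattern and an annotation query that the abstract describes.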
Year
1993
DOI
10.1109/34.254060
Venue
IEEE Transactions on Pattern Analysis and Machine Intelligence
Keywords
set similarity, subset p, query phi, query language, biological function, protein sequence pattern, inferring correlation, set algebra, query space, protein sequence database, protein sequence patterns, protein sequence, database queries, data analysis, pattern analysis, set theory, proteins, database theory, stochastic processes, sequences, first order, indexing terms, algebra, databases, cancer, helium, molecular biology
Field
Query optimization, Query language, Algebra of sets, Sequence database, Computer science, Sargable, View, Theoretical computer science, Database theory, Database, Boolean conjunctive query
DocType
Journal
Volume
15
Issue
10
ISSN
0162-8828
Citations
4
PageRank
0.85
References
4
Authors
2
Name
Order
Citations
PageRank
Roderic Guigó123128.29
Temple F. Smith213973.26