Title
Extraction of Ambiguous Sequential Patterns with Least Minimum Generalization from Mismatch Clusters
Abstract
An ambiguous query on a sequence database returns a set of similar subsequences, called a mismatch cluster. A mismatch cluster can contain a very large number of similar subsequences, which makes it difficult for users to identify their characteristics. To help users understand mismatch clusters, it is important to extract a set of ambiguous sequential patterns with the least minimum generalization from the cluster. Extracting this pattern set requires an enormous amount of computation time, because generalized patterns with minimum cover for the mismatch cluster must be discovered among the candidate generalized patterns. This paper proposes an iterative refinement method that extracts ambiguous sequential patterns with minimum cover for mismatch clusters selected from a sequence database. It also proposes combining the method with a domain segmentation method to make pattern extraction efficient. A prototype implementing the two proposed methods was applied to three datasets from PROSITE to evaluate their usefulness. The proposed methods showed a high capability to extract ambiguous sequential patterns from the mismatch clusters returned by an ambiguous query on the sequence database.
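The terms in the abstract can be illustrated with a minimal sketch. It is not the paper's iterative refinement or domain segmentation method. It assumes that an ambiguous query means "match within k mismatches" (Hamming distance) and that a generalized pattern is written in a PROSITE-like syntax, with bracketed alternatives at variable positions. The function names `hamming`, `mismatch_cluster`, and `generalize` are hypothetical.

```python
from typing import List


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def mismatch_cluster(sequence: str, query: str, k: int) -> List[str]:
    """Return every window of `sequence` within k mismatches of `query`.

    Assumption: this stands in for the result of an ambiguous query.
    """
    m = len(query)
    return [
        sequence[i:i + m]
        for i in range(len(sequence) - m + 1)
        if hamming(sequence[i:i + m], query) <= k
    ]


def generalize(cluster: List[str]) -> str:
    """Build one position-wise pattern that covers every cluster member.

    Each position lists the residues seen there; a single residue is
    written bare, several are written as a bracketed set.
    """
    parts = []
    for column in zip(*cluster):
        residues = sorted(set(column))
        if len(residues) == 1:
            parts.append(residues[0])
        else:
            parts.append("[" + "".join(residues) + "]")
    return "-".join(parts)


cluster = mismatch_cluster("ACDEFACDGFAKDEF", "ACDEF", 1)
print(cluster)              # ['ACDEF', 'ACDGF', 'AKDEF']
print(generalize(cluster))  # A-[CK]-D-[EG]-F
```

The single pattern above also matches AKDGF, which is not in the cluster. Over-generalization of this kind is presumably why the abstract asks for a set of patterns with the least generalization rather than one pattern covering everything.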
Year
2007
DOI
10.1109/SITIS.2007.104
Venue
SITIS
Keywords
mismatch clusters, minimum generalization, iterative refinement method, ambiguous sequential pattern, ambiguous sequence pattern, domain segmentation method, sequence database, mismatch cluster, ambiguous sequential patterns, sequence databases, ambiguous query, ambiguous sequential pattern set, barium, iterative methods, artificial intelligence, database management systems, bismuth
Field
Iterative refinement, Cluster (physics), Data mining, Sequence database, Pattern recognition, Segmentation, Iterative method, Computer science, Pattern clustering, Artificial intelligence, PROSITE
DocType
Conference
Citations
2
PageRank
0.38
References
5
Authors
5
Name
Order
Citations
PageRank
Kotaro Araki131.07
Keiichi Tamura23713.86
Tomoyuki Kato3808.59
Yasuma Mori4219.89
H. Kitakami59449.68