Title
Efficient Mining of Gap-Constrained Subsequences and Its Various Applications
Abstract
Mining frequent subsequence patterns is a typical data-mining problem and various efficient sequential pattern mining algorithms have been proposed. In many application domains (e.g., biology), the frequent subsequences confined by the predefined gap requirements are more meaningful than the general sequential patterns. In this article, we propose two algorithms, Gap-BIDE for mining closed gap-constrained subsequences from a set of input sequences, and Gap-Connect for mining repetitive gap-constrained subsequences from a single input sequence. Inspired by some state-of-the-art closed or constrained sequential pattern mining algorithms, the Gap-BIDE algorithm adopts an efficient approach to finding the complete set of closed sequential patterns with gap constraints, while the Gap-Connect algorithm efficiently mines an approximate set of long patterns by connecting short patterns. We also present several methods for feature selection from the set of gap-constrained patterns for the purpose of classification and clustering. Our extensive performance study shows that our approaches are very efficient in mining frequent subsequences with gap constraints, and the gap-constrained pattern based classification/clustering approaches can achieve high-quality results.
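As a rough illustration of what "frequent subsequences with gap constraints" means, the sketch below counts how many input sequences contain a pattern when the number of skipped items between consecutive matched items must fall within a fixed range. It is not Gap-BIDE or Gap-Connect; the abstract does not give the exact gap definition, so the functions `occurs_with_gaps` and `support` and the parameters `min_gap` and `max_gap` are assumptions introduced here for illustration only.

```python
from typing import Iterable, Sequence


def occurs_with_gaps(pattern: Sequence, sequence: Sequence,
                     min_gap: int, max_gap: int) -> bool:
    """Return True if `pattern` occurs in `sequence` so that the number of
    skipped items between consecutive matched positions lies in
    [min_gap, max_gap]. (Assumed gap definition, not taken from the paper.)"""
    if not pattern:
        return True
    # Positions where the first pattern item matches.
    ends = {i for i, item in enumerate(sequence) if item == pattern[0]}
    for wanted in pattern[1:]:
        next_ends = set()
        for end in ends:
            lo = end + 1 + min_gap                           # nearest allowed position
            hi = min(end + 1 + max_gap, len(sequence) - 1)   # farthest allowed position
            for j in range(lo, hi + 1):
                if sequence[j] == wanted:
                    next_ends.add(j)
        if not next_ends:
            return False  # no way to extend any partial match
        ends = next_ends
    return True


def support(pattern: Sequence, database: Iterable[Sequence],
            min_gap: int, max_gap: int) -> int:
    """Count the sequences in `database` that contain `pattern`
    under the gap constraint."""
    return sum(occurs_with_gaps(pattern, seq, min_gap, max_gap)
               for seq in database)


if __name__ == "__main__":
    database = ["ACGTAC", "AGGC", "ACAC"]
    print(support("AC", database, min_gap=0, max_gap=1))
```

A pattern would then be called frequent if its support reaches a user-chosen threshold. Tracking the set of all possible end positions, rather than only the leftmost match, matters here because a greedy leftmost match can violate the gap bounds even when a later match satisfies them.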
Year: 2012
DOI: 10.1145/2133360.2133362
Venue: TKDD
Keywords: approximate set, gap-constrained pattern, frequent subsequence, various applications, complete set, mining algorithm, gap constraint, gap-constrained subsequences, general sequential pattern, efficient mining, gap-constrained subsequence, closed sequential pattern, sequential pattern mining algorithm, data mining, feature selection, sequential pattern mining
Field: Data mining, Pattern recognition, Feature selection, Computer science, Artificial intelligence, Cluster analysis, Subsequence, Sequential Pattern Mining, Machine learning
DocType: Journal
Volume: 6
Issue: 1
ISSN: 1556-4681
Citations: 27
PageRank: 0.81
References: 35
Authors: 4
Name, Order, Citations, PageRank
Chun Li, 1, 27, 0.81
Qingyan Yang, 2, 137, 8.63
Jianyong Wang, 3, 5295, 230.18
Ming Li, 4, 5595, 829.00