Title
Methods for the Efficient Discovery of Large Item-Indexable Sequential Patterns.
Abstract
An increasingly relevant set of tasks, such as the discovery of biclusters with order-preserving properties, can be mapped to a sequential pattern mining problem on data with item-indexable properties. An item-indexable database, typically observed in biomedical domains, does not allow item repetitions per sequence and is commonly dense. Although multiple methods have been proposed for the efficient discovery of sequential patterns, their performance rapidly degrades over item-indexable databases. The target tasks for these databases benefit from lengthy patterns and tolerate local mismatches. However, existing methods that consider noise relaxations to increase the typically short average length of sequential patterns scale poorly, aggravating the already critical efficiency problem. In this work, we first propose a new sequential pattern mining method, IndexSpan, which mines sequential patterns over item-indexable databases with heightened efficiency. Second, we propose a pattern-merging procedure, MergeIndexBic, to efficiently discover lengthy noise-tolerant sequential patterns. The superior performance of IndexSpan and MergeIndexBic against competitive alternatives is demonstrated on both synthetic and real datasets.
Year: 2013
DOI: 10.1007/978-3-319-08407-7_7
Venue: Lecture Notes in Artificial Intelligence
Field: Data mining, Computer science, Artificial intelligence, Sequential Pattern Mining, Machine learning
DocType: Conference
Volume: 8399
ISSN: 0302-9743
Citations: 8
PageRank: 0.43
References: 21
Authors: 3
Name              Order  Citations  PageRank
Rui Henriques     1      143        12.35
Cláudia Antunes   2      161        16.57
Sara C. Madeira   3      1242       66.91