Abstract | ||
---|---|---|
This paper addresses the problem of timestamped event sequence matching, a new type of similar sequence matching that retrieves the occurrences of interesting patterns from timestamped sequence databases. The sequential-scan-based method, the trie-based method, and the method based on the iso-depth index are well-known approaches to this problem. In this paper, we point out their shortcomings, and propose a new method that effectively overcomes these shortcomings. The proposed method employs an R∗-tree, a widely accepted multi-dimensional index structure that efficiently supports timestamped event sequence matching. To build the R∗-tree, this method extracts time windows from every item in a timestamped event sequence and represents them as rectangles in n-dimensional space by considering the first and last occurring times of each event type. Here, n is the total number of disparate event types that may occur in a target application. To resolve the dimensionality curse in the case when n is large, we suggest an algorithm for reducing the dimensionality by grouping the event types. Our sequence matching method based on the R∗-tree performs with two steps. First, it efficiently identifies a small number of candidates by searching the R∗-tree. Second, it picks out true answers from the set of candidates. We prove its robustness formally, and also show its effectiveness via extensive experiments. |
Year | DOI | Venue |
---|---|---|
2007 | 10.1016/j.ins.2007.06.020 | Information Sciences |
Keywords | Field | DocType |
Sequence database,Event sequence,Timestamped event sequence matching,Similar sequence matching,Multi-dimensional index | Small number,Multi dimensional,Sequence database,Algorithm,Search engine indexing,Robustness (computer science),Curse of dimensionality,Event sequence,Trie,Mathematics | Journal |
Volume | Issue | ISSN |
177 | 22 | 0020-0255 |
Citations | PageRank | References |
5 | 0.76 | 16 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Sanghyun Park | 1 | 729 | 80.64 |
Jung-Im Won | 2 | 86 | 10.56 |
Jee-Hee Yoon | 3 | 49 | 8.70 |
Sang-Wook Kim | 4 | 792 | 152.77 |