Abstract | ||
---|---|---|
In this paper, we study a frequent substructure discovery problem in semi-structured data. We present an efficient algorithm Unot that computes all frequent labeled unordered trees appearing in a large collection of data trees with frequency above a user-specified threshold. The keys of the algorithm are efficient enumeration of all unordered trees in canonical form and incremental computation of their occurrences. We then show that Unot discovers each frequent pattern T in O(kb(2)m) per pattern, where k is the size of T, b is the branching factor of the data trees, and m is the total number of occurrences of T in the data trees. |
Year | DOI | Venue |
---|---|---|
2003 | 10.1007/978-3-540-39644-4_6 | Lecture Notes in Artificial Intelligence |
Keywords | Field | DocType |
semi structured data,canonical form | Data mining,Tree (graph theory),Computer science,Artificial intelligence,Weight-balanced tree,Computation,Discrete mathematics,Branching factor,Enumeration,Frequent subtree mining,Canonical form,Knowledge extraction,Machine learning | Conference |
Volume | ISSN | Citations |
2843 | 0302-9743 | 93 |
PageRank | References | Authors |
4.86 | 18 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Tatsuya Asai | 1 | 349 | 34.73 |
Hiroki Arimura | 2 | 1130 | 92.90 |
Takeaki Uno | 3 | 1319 | 107.99 |
Shin-ichi Nakano | 4 | 246 | 24.40 |