Title
Ordered subtree mining via transactional mapping using a structure-preserving tree database schema
Abstract
Frequent subtree mining is a major research topic in knowledge discovery from tree-structured data, whose importance is witnessed by the pervasiveness of such data in several domains. In this paper, we present a novel approach to discover all the frequent ordered subtrees in a tree-structured database. A key aspect is that the structural aspects of the input tree instances are extracted to generate a transactional format that enables the application of standard itemset mining techniques. In this way, the expensive process of subtree enumeration is avoided, while subtrees can be reconstructed in a post-processing stage. As a result, more structurally complex tree data can be handled and much lower support thresholds can be used. In addition to discovering traditional subtrees, this is the first approach to frequent subtree mining that can discover position-constrained subtrees. Each node in the position-constrained subtree is annotated with its exact occurrence and level of embedding in the original database tree. Also, disconnected subtree associations can be represented via virtual connecting nodes. Experiments conducted on synthetic and real-world datasets confirm the expected advantages of our approach over competing methods in terms of efficiency, mining capabilities, and informativeness of the extracted patterns.
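The transactional mapping described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each tree is a nested `(label, children)` tuple and that each node's position can be approximated by its pre-order index and depth within its own tree, whereas the paper derives positions from a structure-preserving database schema. The function names `tree_to_transaction` and `frequent_itemsets` are hypothetical, and the brute-force counting stands in for any standard itemset miner such as Apriori or FP-growth.

```python
from collections import Counter
from itertools import combinations


def tree_to_transaction(tree):
    """Flatten a (label, children) tree into a set of (position, depth, label) items.

    Assumption: position is the node's pre-order index in its own tree.
    """
    items = set()
    position = 0

    def visit(node, depth):
        nonlocal position
        label, children = node
        items.add((position, depth, label))
        position += 1
        for child in children:  # children are visited left to right (ordered trees)
            visit(child, depth + 1)

    visit(tree, 0)
    return frozenset(items)


def frequent_itemsets(transactions, min_support, max_size=3):
    """Count every itemset of up to max_size items; keep those meeting min_support.

    Brute-force stand-in for a real itemset miner; only suitable for tiny inputs.
    """
    counts = Counter()
    for transaction in transactions:
        ordered = sorted(transaction)
        for size in range(1, max_size + 1):
            for itemset in combinations(ordered, size):
                counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Hypothetical three-tree database used only for illustration.
db = [
    ("a", [("b", []), ("c", [("d", [])])]),
    ("a", [("b", []), ("c", [("e", [])])]),
    ("a", [("x", []), ("c", [("d", [])])]),
]

transactions = [tree_to_transaction(tree) for tree in db]
frequent = frequent_itemsets(transactions, min_support=2)

for itemset, support in sorted(frequent.items()):
    print(support, itemset)
```

Because every item carries its position and depth, a frequent itemset can be turned back into a position-constrained pattern in a post-processing step, and items that are not directly connected in the tree can still appear together in one itemset.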
Year
2015
DOI
10.1016/j.ins.2015.03.015
Venue
Inf. Sci.
Keywords
frequent subtree mining, position-constrained subtree discovery, semistructured data, transactional representation
Field
Data mining, Embedding, Computer science, T-tree, Tree (data structure), Enumeration, Frequent subtree mining, Database schema, Artificial intelligence, Knowledge extraction, Transactional leadership, Machine learning
DocType
Journal
Volume
310
Issue
C
ISSN
0020-0255
Citations
1
PageRank
0.35
References
34
Authors
3
Name, Order, Citations, PageRank
Fedja Hadzic, 1, 175, 15.55
Michael Hecker, 2, 13, 1.47
Andrea Tagarelli, 3, 475, 52.29