Title
A heuristic algorithm for clustering rooted ordered trees
Abstract
Recently, tree structures have become a popular way of storing large amounts of data. Clustering these data can facilitate operations such as storage, retrieval, rule extraction and processing. In this paper, we propose a novel heuristic algorithm for clustering tree-structured data, called TreeCluster. The algorithm maintains a representative tree for each cluster, which distinguishes it from traditional methods based on computing tree edit distance. TreeCluster compares each input tree T only with the representative trees of the clusters, which significantly reduces the running time. We show the efficiency of TreeCluster in terms of time complexity. Furthermore, we empirically evaluate the effectiveness and accuracy of TreeCluster in comparison with previous work. Our experimental results show that TreeCluster improves several cluster quality measures, such as intra-cluster similarity, inter-cluster similarity, the Dunn index and the Davies-Bouldin (DB) index.
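The representative-based idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's TreeCluster procedure: the abstract does not specify the similarity measure, the representative update rule, or any threshold, so the `Tree` class, `edge_set`, `similarity` (Jaccard over parent-child label pairs, which ignores sibling order), the union-based update, and the `threshold` parameter are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Tree:
    """Rooted ordered tree: a label plus an ordered list of child trees."""
    label: str
    children: list = field(default_factory=list)


def edge_set(tree):
    """Collect (parent label, child label) pairs as a stand-in feature set."""
    edges = set()
    stack = [tree]
    while stack:
        node = stack.pop()
        for child in node.children:
            edges.add((node.label, child.label))
            stack.append(child)
    return edges


def similarity(a, b):
    """Jaccard similarity of two edge sets (assumed measure, not the paper's)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_trees(trees, threshold=0.5):
    """Assign each tree to its most similar representative, or open a new cluster."""
    representatives = []  # one edge set per cluster (hypothetical representative)
    assignments = []      # cluster index for each input tree, in input order
    for tree in trees:
        edges = edge_set(tree)
        best, best_sim = None, -1.0
        # Compare only against cluster representatives, never against all trees.
        for idx, rep in enumerate(representatives):
            sim = similarity(edges, rep)
            if sim > best_sim:
                best, best_sim = idx, sim
        if best is not None and best_sim >= threshold:
            # Assumed update rule: merge the new tree's edges into the representative.
            representatives[best] = representatives[best] | edges
            assignments.append(best)
        else:
            representatives.append(edges)
            assignments.append(len(representatives) - 1)
    return assignments


t1 = Tree("a", [Tree("b"), Tree("c")])
t2 = Tree("a", [Tree("b"), Tree("c"), Tree("d")])
t3 = Tree("x", [Tree("y")])
labels = cluster_trees([t1, t2, t3])
```

With n input trees and k clusters, a scheme of this shape performs on the order of n·k comparisons against representatives instead of n² pairwise tree comparisons, which is the kind of saving the abstract attributes to comparing each tree only with the representative trees.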
Year: 2007
DOI: 10.3233/IDA-2007-11404
Venue: Intell. Data Anal.
Keywords: cluster quality, tree structure, TreeCluster algorithm, intra-cluster similarity, clustering tree, representative tree, input tree, inter-cluster similarity, heuristic algorithm, time complexity
Field: Data mining, Computer science, K-ary tree, Vantage-point tree, Tree structure, Artificial intelligence, Interval tree, Tree traversal, Pattern recognition, Segment tree, Machine learning, Search tree, Incremental decision tree
DocType: Journal
Volume: 11
Issue: 4
ISSN: 1088-467X
Citations: 1
PageRank: 0.35
References: 30
Authors: 4
Name                         Order  Citations  PageRank
Mostafa Haghir Chehreghani   1      50         8.46
Masoud Rahgozar              2      72         8.77
Caro Lucas                   3      1501       103.34
Morteza Haghir Chehreghani   4      110        16.07