Abstract | ||
---|---|---|
Concept hierarchies are important for generalization in many data mining applications. We propose a method to automatically build a concept hierarchy from a provided distance matrix. The method is a modification of the traditional agglomerative hierarchical clustering algorithm. When the two closest clusters are selected for combining, the algorithm either creates a new cluster with the two original clusters as its sub-clusters, or lets one cluster join the other without creating a new cluster at a higher level of the hierarchy. To evaluate the algorithm, a distance matrix is derived from the concept hierarchy it builds. The root mean squared error between the provided distance matrix and the derived distance matrix is used as the evaluation criterion. Empirical results show that the traditional algorithm performs better under the complete-link strategy than under the other strategies, that our algorithms perform almost the same under all three strategies, and that our algorithms outperform the traditional algorithms in a variety of situations. |
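
The evaluation described in the abstract can be illustrated with a minimal sketch of the traditional baseline only, not the proposed modification. It runs plain agglomerative clustering on a provided distance matrix, derives a second distance matrix from the resulting hierarchy, and computes the root mean squared error between the two. The abstract does not say how the derived distance is defined, so the sketch assumes the merge height at which two items first share a cluster (the cophenetic distance). The names `agglomerate` and `rmse` and the sample matrix are hypothetical.

```python
from itertools import combinations
from math import sqrt


def agglomerate(dist, linkage=max):
    """Traditional agglomerative clustering on a symmetric distance matrix.

    `linkage` reduces the item-to-item distances between two clusters to one
    number, e.g. max for complete link or min for single link.
    Returns a derived matrix of merge heights (an assumed definition).
    """
    n = len(dist)
    clusters = [[i] for i in range(n)]
    derived = [[0.0] * n for _ in range(n)]
    while len(clusters) > 1:
        # Find the closest pair of clusters under the chosen linkage.
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            d = linkage(dist[i][j] for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        height, a, b = best
        # Every cross-cluster pair first meets at this merge height.
        for i in clusters[a]:
            for j in clusters[b]:
                derived[i][j] = derived[j][i] = height
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return derived


def rmse(provided, derived):
    """Root mean squared error over all unordered item pairs."""
    pairs = list(combinations(range(len(provided)), 2))
    total = sum((provided[i][j] - derived[i][j]) ** 2 for i, j in pairs)
    return sqrt(total / len(pairs))


# Hypothetical 4-item distance matrix, for illustration only.
provided = [
    [0, 1, 4, 5],
    [1, 0, 3, 6],
    [4, 3, 0, 2],
    [5, 6, 2, 0],
]
complete = agglomerate(provided, linkage=max)
single = agglomerate(provided, linkage=min)
print(rmse(provided, complete), rmse(provided, single))
```

A lower value means the hierarchy reproduces the provided distances more faithfully, which is the sense in which the abstract compares linkage strategies and algorithms.
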
Year | DOI | Venue |
---|---|---|
2006 | 10.2991/jcis.2006.142 | JCIS |
Keywords | Field | DocType |
data mining, concept hierarchy, hierarchical clustering, distance matrix, root mean square error | Hierarchical clustering, k-medians clustering, Fuzzy clustering, Data mining, Canopy clustering algorithm, CURE data clustering algorithm, Computer science, Hierarchical clustering of networks, Cluster analysis, Single-linkage clustering | Conference |
Citations | PageRank | References |
4 | 0.41 | 11 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Huang-Cheng Kuo | 1 | 42 | 23.87 |
Tsung-Han Tsai | 2 | 340 | 66.41 |
Jen-Peng Huang | 3 | 57 | 6.45 |