On clustering tree structured data with categorical nature - Citegraph

Paper Info

Title
On clustering tree structured data with categorical nature

Abstract
Clustering consists of partitioning a set of objects into disjoint, homogeneous clusters. For many years, clustering methods have been applied in a wide variety of disciplines and scientific areas. Traditionally, clustering methods deal with numerical data, i.e. objects represented by a conjunction of numerical attribute values. However, commercial and scientific databases now usually contain categorical data, i.e. objects represented by categorical attributes. In this paper we present a dissimilarity measure that is capable of dealing with tree-structured categorical data. It can therefore be used to extend the various versions of the popular k-means clustering algorithm to such data, and we discuss how such an extension can be achieved. Moreover, we show empirically that the proposed dissimilarity measure is accurate compared with other well-known (dis)similarity measures for categorical data.

Year	DOI	Venue
2008	10.1016/j.patcog.2008.05.023	Pattern Recognition
Keywords	Field	DocType
clustering methods deal,scientific area,numerical attribute value,categorical attribute,dissimilarity measure,numerical data,clustering tree,scientific databases,categorical data,clustering method,proposed dissimilarity measure,categorical nature,tree structure,clustering,k means clustering,data mining	Fuzzy clustering,Data mining,CURE data clustering algorithm,Data stream clustering,Correlation clustering,Pattern recognition,Categorical variable,Consensus clustering,Artificial intelligence,Constrained clustering,Cluster analysis,Mathematics	Journal
Volume	Issue	ISSN
41	12	Pattern Recognition
Citations	PageRank	References
11	0.56	25
Authors
2

Authors (2 rows)

Name	Order	Citations	PageRank
B. Boutsinas	1	82	5.59
T. Papastergiou	2	16	1.33
