Title
A novel DBSCAN with entropy and probability for mixed data.
Abstract
In big-data settings, detecting clusters of different sizes and shapes is a challenging and imperative task. Over the last several years, density-based clustering approaches have been widely used in many areas of science because of their simplicity and their ability to detect clusters of different sizes and shapes. A modified version of the DBSCAN algorithm is proposed to cluster mixed data by applying various conversions to the categorical attributes; it is called the density-based clustering algorithm for mixed data with integration of entropy and probability distribution (EPDCA). Several optional conversions are provided so that the clustering process can adapt to the data. Benchmark data sets from the UCI repository were selected to test the capability and validity of EPDCA. The results show that EPDCA considerably improves clustering, especially in automatically determining the number of clusters, in discovering noise, and in the time elapsed to form clusters.
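The abstract does not specify the conversions EPDCA uses, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes one hypothetical conversion: each categorical value is replaced by its empirical probability in its column and scaled by the column's Shannon entropy, after which a plain DBSCAN is run on the resulting numeric points. The function names, the toy rows, and the `eps` and `min_pts` values are all illustrative assumptions.

```python
import math
from collections import Counter

def column_entropy(values):
    """Shannon entropy (in bits) of a categorical column."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def probability_encode(values):
    """Replace each category by its empirical probability in the column."""
    n = len(values)
    counts = Counter(values)
    return [counts[v] / n for v in values]

def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one label per point, with -1 meaning noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1          # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point previously marked as noise
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbors(j)
            if len(nb) >= min_pts:   # j is a core point, so keep expanding
                queue.extend(nb)
    return labels

# Toy mixed data (hypothetical): one numeric and one categorical attribute.
rows = [(1.0, "a"), (1.1, "a"), (0.9, "a"),
        (5.0, "b"), (5.1, "b"), (4.9, "b"),
        (9.0, "c")]
categories = [r[1] for r in rows]
weight = column_entropy(categories)        # assumed entropy-based weight
encoded = probability_encode(categories)   # assumed probability conversion
points = [(r[0], weight * p) for r, p in zip(rows, encoded)]

labels = dbscan(points, eps=0.5, min_pts=3)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

In this toy run the number of clusters is not supplied in advance and the isolated row is labelled as noise, which mirrors the two properties the abstract highlights. Note that a pure probability encoding maps equally frequent categories to the same value, so a real conversion would need to be chosen with the data in mind.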
Year: 2017
DOI: 10.1007/s10586-017-0818-3
Venue: Cluster Computing
Keywords: Distance measure, Density-based clustering, Conversion, Entropy, Mixed data
Field: OPTICS algorithm, Data mining, CURE data clustering algorithm, Correlation clustering, Pattern recognition, Computer science, Determining the number of clusters in a data set, SUBCLU, Artificial intelligence, Cluster analysis, DBSCAN, Single-linkage clustering
DocType: Journal
Volume: 20
Issue: 2
ISSN: 1386-7857
Citations: 4
PageRank: 0.38
References: 24
Authors: 3
Name | Order | Citations, PageRank
Xingxing Liu | 1 | 40.72
Qing Yang | 2 | 4825.86
Ling He | 3 | 526.94