Title
DACA: Distributed adaptive grid decision graph based clustering algorithm
Abstract
Clustering algorithms play an important role in machine learning. With the development of big data and artificial intelligence, distributed parallel algorithms have become an important research field. To reduce the computational complexity and running time of clustering large-scale datasets, this study proposes DACA, a distributed adaptive grid decision graph based clustering algorithm. In a distributed environment, DACA uses relative entropy to adaptively partition the data into a grid whose cells separate clearly into sparse and dense cells. A decision graph is then used to identify the grid cells that serve as cluster centers. Finally, a KD-tree accelerates the assignment of sparse points to cluster centers, which completes the clustering. The algorithm is implemented on the Apache Spark computing framework. Compared with other distributed clustering algorithms, DACA divides the grid adaptively according to the data distribution and thereby obtains better clustering results. Experiments on six standard datasets and real GPS trajectory datasets show that DACA achieves excellent performance and accuracy.
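The decision-graph step can be illustrated with a minimal sketch. It assumes the usual density-peaks construction (a density rho per object and a distance delta to the nearest denser object), which the abstract does not spell out. As simplifications, it uses a fixed cell size instead of DACA's relative-entropy adaptive grid, runs on a single machine rather than Spark, and omits the KD-tree assignment of sparse points. All function names and the sample points are hypothetical.

```python
from collections import Counter
from math import dist, floor


def grid_cells(points, cell_size):
    """Count points per grid cell; keys are integer cell coordinates."""
    return Counter(tuple(floor(c / cell_size) for c in p) for p in points)


def decision_graph(cells):
    """Return {cell: (rho, delta)} for every non-empty grid cell.

    rho is the number of points in the cell. delta is the distance, in cell
    units, to the nearest cell with a strictly higher rho. A cell with no
    denser cell gets the largest distance to any other cell instead.
    """
    graph = {}
    for cell, rho in cells.items():
        denser = [dist(cell, other) for other, r in cells.items() if r > rho]
        if denser:
            delta = min(denser)
        else:
            delta = max((dist(cell, other) for other in cells), default=0.0)
        graph[cell] = (rho, delta)
    return graph


def pick_centers(graph, k):
    """Pick the k cells with the largest rho * delta as cluster-center cells."""
    return sorted(graph, key=lambda c: graph[c][0] * graph[c][1], reverse=True)[:k]


# Hypothetical toy data: two well-separated groups of 2-D points.
points = [
    (0.1, 0.1), (0.2, 0.8), (0.5, 0.5), (0.7, 0.3), (0.9, 0.9),
    (1.2, 0.4), (1.6, 0.1),
    (10.1, 10.2), (10.5, 10.5), (10.8, 10.3), (10.4, 10.9),
    (10.3, 11.2),
]
cells = grid_cells(points, cell_size=1.0)
centers = pick_centers(decision_graph(cells), k=2)
print(centers)
```

Cells that are both dense and far from any denser cell stand out with a large rho * delta product, which is what makes them candidates for cluster centers. Here k is supplied by hand, whereas a decision graph is normally inspected to choose the centers.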
Year: 2022
DOI: 10.1002/spe.3060
Venue: SOFTWARE-PRACTICE & EXPERIENCE
Keywords: adaptive grid division, clustering algorithms, decision graphs, distributed, KD-tree
DocType: Journal
Volume: 52
Issue: 5
ISSN: 0038-0644
Citations: 0
PageRank: 0.34
References: 0
Authors: 4
Name         Order  Citations  PageRank
Jing He      1      362        48.04
Jun Zhou     2      4          3.47
Haoyu Wang   3      0          0.34
Li Cai       4      0          0.34