Title
Towards a New Approach for Empowering the MR-DBSCAN Clustering for Massive Data Using Quadtree
Abstract
Emerging technologies such as social networks and the IoT generate huge amounts of data on a daily basis. Analyzing and clustering this data lets us uncover hidden value and patterns. DBSCAN is a powerful clustering algorithm that detects patterns by clustering data based on density; it classifies each point as a core point, a border point, or noise. DBSCAN is already used in many applications, such as retail business, medical imaging, and text mining. However, the availability of advanced networks and sophisticated machines has increased the need to move traditional clustering algorithms from single-node to parallel, multi-node environments. In this paper, we present a solution that parallelizes DBSCAN using the Quadtree data structure. Our solution partitions the dataset into smaller chunks, then uses a parallel programming framework such as Map-Reduce to provide the infrastructure for storing and processing these chunks. We use various training sets to evaluate the performance of both traditional DBSCAN and our Map-Reduce DBSCAN prototype. We analyze our solution in terms of time complexity, efficiency, scalability, value, and accuracy. Our analysis illustrates the benefits of parallelized DBSCAN clustering and shows the usefulness of managing subsets of data with the Quadtree data structure.
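The two ideas named in the abstract, DBSCAN's core/border/noise classification and Quadtree-based partitioning of the dataset into chunks, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `classify_points` and `quadtree_leaves`, the parameters `eps`, `min_pts`, `capacity`, and `max_depth`, and the brute-force neighbor search are all assumptions made for illustration, and the sketch omits the Map-Reduce layer and any merging of clusters across chunk boundaries.

```python
from math import dist


def classify_points(points, eps, min_pts):
    """Label each 2-D point 'core', 'border', or 'noise' (brute force, O(n^2)).

    Assumption: a point's eps-neighborhood includes the point itself,
    as in the usual DBSCAN definition.
    """
    neighbors = [
        [j for j, q in enumerate(points) if dist(p, q) <= eps]
        for p in points
    ]
    core = {i for i, nbrs in enumerate(neighbors) if len(nbrs) >= min_pts}
    labels = []
    for i, nbrs in enumerate(neighbors):
        if i in core:
            labels.append("core")
        elif any(j in core for j in nbrs):
            labels.append("border")  # not dense itself, but near a core point
        else:
            labels.append("noise")
    return labels


def quadtree_leaves(points, bounds, capacity, max_depth=16):
    """Split `bounds` = (x0, y0, x1, y1) into quadrants until every leaf
    holds at most `capacity` points; return a list of (bounds, points).

    Hypothetical helper: each leaf stands in for one chunk of data that a
    parallel worker could cluster independently.
    """
    x0, y0, x1, y1 = bounds
    if len(points) <= capacity or max_depth == 0:
        return [(bounds, points)]
    mx, my = (x0 + x1) / 2, (y0 + y1) / 2
    boxes = [(x0, y0, mx, my), (mx, y0, x1, my),
             (x0, my, mx, y1), (mx, my, x1, y1)]
    buckets = [[], [], [], []]
    for p in points:
        # Quadrant index: bit 0 = right half, bit 1 = upper half.
        buckets[(p[0] >= mx) + 2 * (p[1] >= my)].append(p)
    leaves = []
    for box, bucket in zip(boxes, buckets):
        if bucket:  # skip empty quadrants
            leaves.extend(quadtree_leaves(bucket, box, capacity, max_depth - 1))
    return leaves


points = [(0, 0), (0, 1), (1, 0), (1, 1), (2.4, 1), (10, 10)]
labels = classify_points(points, eps=1.5, min_pts=4)
leaves = quadtree_leaves(points, (0, 0, 10, 10), capacity=2)
```

In a parallel setting, each leaf would typically also need the points within `eps` of its boundary so that clusters spanning several chunks can be merged afterwards; that step is left out here.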
Year
2018
DOI
10.1109/HPCC/SmartCity/DSS.2018.00044
Venue
2018 IEEE 20th International Conference on High Performance Computing and Communications; IEEE 16th International Conference on Smart City; IEEE 4th International Conference on Data Science and Systems (HPCC/SmartCity/DSS)
Keywords
DBSCAN, Map-Reduce, Quadtree, Parallel Clustering Algorithm
Field
Data structure, Data mining, Computer science, Internet of Things, Cluster analysis, Time complexity, DBSCAN, Quadtree, Distributed computing, Scalability
DocType
Conference
ISBN
978-1-5386-6615-9
Citations
0
PageRank
0.34
References
0
Authors
2
Name | Order | Citations | PageRank
Rami Ibrahim | 1 | 1 | 1.37
M. Omair Shafiq | 2 | 139 | 18.59