Title
Efficient Approaches for Solving the Large-Scale k-Medoids Problem: Towards Structured Data.
Abstract
Clustering objects represented by structured data, possibly with non-trivial geometry, is an interesting task in pattern recognition. Moreover, in the Big Data era, clustering huge amounts of (structured) data challenges computer science and pattern recognition researchers alike. This paper aims to bridge the gap in large-scale structured data clustering. Specifically, following a previous work, a parallel and distributed k-medoids clustering implementation is proposed and tested on real-world biological structured data, namely pathway maps (graphs) and primary structures of proteins (sequences). Furthermore, two methods for evaluating medoids, based on exact and approximate procedures respectively, are proposed and compared in terms of scalability. Computational results show that the proposed implementation is flexible with respect to the dissimilarity measure and the input space adopted, with satisfactory results in terms of scalability.
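To make the distinction between exact and approximate medoid evaluation concrete, the following is a minimal single-machine sketch, not the paper's parallel and distributed implementation. The function names, the sampling-based approximation, and the Hamming-style dissimilarity are all illustrative assumptions; the abstract does not specify the actual approximate procedure.

```python
import random
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def exact_medoid(cluster: Sequence[T], dissimilarity: Callable[[T, T], float]) -> T:
    """Return the element with the smallest summed dissimilarity to all
    cluster members. Needs O(n^2) dissimilarity evaluations."""
    return min(cluster, key=lambda c: sum(dissimilarity(c, x) for x in cluster))


def approximate_medoid(
    cluster: Sequence[T],
    dissimilarity: Callable[[T, T], float],
    sample_size: int,
    rng: random.Random,
) -> T:
    """Hypothetical approximation: compute the exact medoid of a random
    sample, which costs O(sample_size^2) evaluations instead of O(n^2)."""
    if len(cluster) <= sample_size:
        return exact_medoid(cluster, dissimilarity)
    sample = rng.sample(list(cluster), sample_size)
    return exact_medoid(sample, dissimilarity)


def hamming(a: str, b: str) -> int:
    """Stand-in dissimilarity for sequences (assumption): position-wise
    mismatches plus the length difference."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


sequences = ["ACGT", "ACGA", "ACGG", "TTTT"]
medoid = exact_medoid(sequences, hamming)
estimate = approximate_medoid(sequences, hamming, sample_size=2, rng=random.Random(0))
```

Because the dissimilarity is passed in as a plain callable, the same two functions would work unchanged with a graph dissimilarity for pathway maps, which mirrors the flexibility with respect to the dissimilarity measure claimed in the abstract.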
Year
2017
DOI
10.1007/978-3-030-16469-0_11
Venue
Studies in Computational Intelligence
Keywords
Cluster analysis, Parallel and distributed computing, Large-scale pattern recognition, Unsupervised learning, Big Data mining, Non-metric spaces analysis
Field
Graph, Data mining, Computer science, Unsupervised learning, Cluster analysis, k-medoids, Big data, Data model, Medoid, Scalability
DocType
Conference
Volume
829
ISSN
1860-949X
Citations
0
PageRank
0.34
References
0
Authors
3
Name                                Order  Citations  PageRank
Alessio Martino                     1      4          3.48
Antonello Rizzi                     2      363        41.68
Fabio Massimo Frattale Mascioli     3      52         15.07