Big Data Clustering With Kernel K-Means: Resources, Time And Performance - Citegraph

Paper Info

Title
Big Data Clustering With Kernel K-Means: Resources, Time And Performance

Abstract
Data clustering is an unsupervised learning task that has found many applications in various scientific fields. The goal is to find subgroups of closely related data samples (clusters) in a set of unlabeled data. A classic clustering algorithm is the so-called k-Means. It is very popular; however, it is unable to handle cases in which the clusters are not linearly separable. Kernel k-Means is a state-of-the-art clustering algorithm that employs the kernel trick to perform clustering in a higher-dimensional space, thus overcoming the limitations of classic k-Means regarding the non-linear separability of the input data. With respect to the challenges of Big Data research, a field that has established itself in the last few years and involves performing tasks on extremely large amounts of data, several adaptations of Kernel k-Means have been proposed, each of which has different requirements in processing power and running time, while also incurring different trade-offs in performance. In this paper, we present several issues and techniques involving the use of Kernel k-Means for Big Data clustering and examine how each combination of components in a clustering framework fares in terms of resources, time and performance. We use experimental results to evaluate several combinations and provide a recommendation on how to approach a Big Data clustering problem.
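
For readers unfamiliar with the method the abstract describes, the following is a minimal NumPy sketch of plain (non-approximate, single-machine) Kernel k-Means. It is not the authors' implementation and does not cover the approximate or Apache Spark variants listed in the keywords. The function names `rbf_kernel` and `kernel_kmeans`, the RBF kernel choice, the `gamma` value, and the optional `init_labels` parameter are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Dense RBF (Gaussian) kernel matrix; the kernel choice is an assumption."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kernel_kmeans(K, k, n_iter=100, seed=0, init_labels=None):
    """Cluster n samples given only their n x n kernel matrix K.

    The squared feature-space distance from sample i to the mean of cluster c is
        K_ii - (2/|c|) * sum_{j in c} K_ij + (1/|c|^2) * sum_{j,l in c} K_jl,
    so the mapped feature vectors are never formed explicitly (the kernel trick).
    """
    n = K.shape[0]
    if init_labels is None:
        # Random initial assignment; a simplification for this sketch.
        labels = np.random.default_rng(seed).integers(0, k, size=n)
    else:
        labels = np.asarray(init_labels).copy()
    diag = np.diag(K)
    for _ in range(n_iter):
        dist = np.full((n, k), np.inf)  # empty clusters stay at infinity
        for c in range(k):
            mask = labels == c
            size = mask.sum()
            if size == 0:
                continue
            cross = K[:, mask].sum(axis=1) / size
            within = K[np.ix_(mask, mask)].sum() / size ** 2
            dist[:, c] = diag - 2.0 * cross + within
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
    return labels

# Usage on a toy dataset (hypothetical data, for illustration only).
X = np.array([[0.0, 0.0], [0.0, 0.1], [0.1, 0.0],
              [5.0, 5.0], [5.0, 5.1], [5.1, 5.0], [5.1, 5.1]])
labels = kernel_kmeans(rbf_kernel(X, gamma=1.0), k=2)
```

Note that this sketch stores the full n x n kernel matrix, so its memory use grows quadratically with the number of samples. That cost is one reason adaptations of Kernel k-Means for large datasets, as mentioned in the abstract, trade resources and time against clustering performance.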

Year	2018
DOI	10.1142/S0218213018600060
Venue	INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS
Keywords	Big data, kernel k-means, data clustering, approximate kernel k-means, Apache Spark, distributed computation
DocType	Journal
Volume	27
Issue	4
ISSN	0218-2130
Citations	0
PageRank	0.34
References	0
Authors	4

Authors (4 rows)

Name	Order	Citations	PageRank
Nikolaos Tsapanos	1	26	3.87
Anastasios Tefas	2	2055	177.05
Nikolaos Nikolaidis	3	108	10.31
Ioannis Pitas	4	6478	626.09
