Parallel Two-Phase K-Means - Citegraph

Paper Info

Title
Parallel Two-Phase K-Means

Abstract
This paper introduces Parallel Two-Phase K-means (Par2PK-means), a new parallel version of Two-Phase K-means designed to overcome the limits of existing parallel versions. Par2PK-means is developed and executed on the MapReduce framework and is divided into two phases. In the first phase, Mappers work independently on data segments to create intermediate data. In the second phase, the Reducer clusters the intermediate data collected from the Mappers to produce the final clustering result. In tests on large data sets, the proposed algorithm attained a good speedup ratio, close to linear, compared with the sequential Two-Phase K-means.
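
The two-phase flow described in the abstract can be sketched roughly as follows. This is a minimal single-process illustration, not the authors' implementation: the abstract does not say what the intermediate data looks like, so the sketch assumes each Mapper emits local centroids weighted by their point counts and the Reducer runs a weighted K-means over them. The names `weighted_kmeans`, `mapper`, `reducer`, `par2pk_sketch` and the parameters `n_segments` and `k_local` are all hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points (tuples of floats)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def weighted_kmeans(points, weights, k, iters=20, seed=0):
    """Plain Lloyd iterations on weighted points.

    Returns (centroids, cluster_weights), where cluster_weights[j] is the
    total weight assigned to centroid j in the last iteration.
    """
    rng = random.Random(seed)
    centroids = rng.sample(list(points), k)
    dims = len(points[0])
    totals = [0.0] * k
    for _ in range(iters):
        sums = [[0.0] * dims for _ in range(k)]
        totals = [0.0] * k
        for p, w in zip(points, weights):
            j = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            totals[j] += w
            for d, x in enumerate(p):
                sums[j][d] += w * x
        # Keep the old centroid if a cluster received no points.
        centroids = [
            tuple(s / totals[j] for s in sums[j]) if totals[j] > 0 else centroids[j]
            for j in range(k)
        ]
    return centroids, totals

def mapper(segment, k_local, seed=0):
    """Phase 1 (assumed form): summarize one data segment as weighted centroids."""
    k_eff = min(k_local, len(segment))
    cents, counts = weighted_kmeans(segment, [1.0] * len(segment), k_eff, seed=seed)
    return [(c, n) for c, n in zip(cents, counts) if n > 0]

def reducer(intermediate, k):
    """Phase 2 (assumed form): cluster the collected summaries into k centroids."""
    points = [c for c, _ in intermediate]
    weights = [n for _, n in intermediate]
    cents, _ = weighted_kmeans(points, weights, min(k, len(points)))
    return cents

def par2pk_sketch(data, k, n_segments=4, k_local=8):
    """Run both phases sequentially; on MapReduce the mappers would run in parallel."""
    segments = [data[i::n_segments] for i in range(n_segments)]
    intermediate = []
    for i, seg in enumerate(segments):
        if seg:
            intermediate.extend(mapper(seg, k_local, seed=i))
    return reducer(intermediate, k)
```

Under these assumptions the Reducer sees only a few weighted summaries per segment rather than the raw points, which is one plausible reason a second clustering pass over the intermediate data can stay cheap.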

Field	Value
Year	2013
DOI	10.1007/978-3-642-39640-3_16
Venue	COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2013, PT V
Keywords	Data Clustering, K-means, Parallel Distributed Computing, MapReduce
Field	k-means clustering, Data set, Computer science, Parallel computing, Reducer, Cluster analysis, Speedup
DocType	Conference
Volume	7975
ISSN	0302-9743
Citations	5
PageRank	0.45
References	4
Authors	3

Authors (3 rows)

Name	Order	Citations	PageRank
Cuong Nguyen	1	207	35.89
Dung Tien Nguyen	2	18	4.33
Van-Hau Pham	3	41	4.56
