Abstract | ||
---|---|---|
This paper describes the design and implementation of a distributed k-means clustering algorithm for text document analysis. The motivation for this research is to propose a distributed approach based on current in-memory distributed computing technologies. We used our Jbowl Java text mining library and GridGain as the framework for distributed computing. Using these technologies, we designed and implemented two variants of the distributed k-means clustering algorithm and evaluated them on standard text data collections. The experiments were conducted in two testing environments: a distributed computing infrastructure and a multi-core server. |
Year | DOI | Venue |
---|---|---|
2015 | 10.1007/978-3-319-28561-0_13 | INFORMATION SYSTEMS ARCHITECTURE AND TECHNOLOGY, ISAT 2015, PT II |
Keywords | DocType | Volume |
---|---|---|
Clustering, Text mining, k-Means, Distributed computing | Conference | 430 |
ISSN | Citations | PageRank |
---|---|---|
2194-5357 | 0 | 0.34 |
References | Authors |
---|---|
0 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Martin Sarnovsky | 1 | 9 | 3.26 |
Noema Carnoka | 2 | 0 | 0.34 |