Abstract | ||
---|---|---|
This paper describes a distributed calculation scheme for scoring relationships among documents. The scheme categorizes documents using an algorithm that calculates a score for the relationship between a category and a word in a document. The calculation time grows as the number of documents increases. Therefore, our scheme uses multiple machines. A master node divides the document set into several subsets and distributes them to the calculation nodes. This distributed calculation shortens the calculation time and also lowers memory usage. |
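
The master/calculation-node split described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the score is a plain (category, word) co-occurrence count standing in for the paper's unspecified scoring algorithm, and a local thread pool stands in for separate machines.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_documents(documents, n_nodes):
    """Master step: divide the document set into n_nodes subsets (round-robin)."""
    return [documents[i::n_nodes] for i in range(n_nodes)]


def score_subset(subset):
    """Calculation-node step: count (category, word) co-occurrences in one subset.

    Assumption: a simple count is used as a placeholder score; the paper's
    actual scoring algorithm is not given in the abstract.
    """
    counts = Counter()
    for category, text in subset:
        for word in text.lower().split():
            counts[(category, word)] += 1
    return counts


def distributed_scores(documents, n_nodes=2):
    """Split the documents, score each subset on its own worker, merge the results."""
    subsets = split_documents(documents, n_nodes)
    # Assumption: threads stand in for the calculation nodes of a real cluster.
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partials = list(pool.map(score_subset, subsets))
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts together
    return total


# Hypothetical sample data: (category, document text) pairs.
documents = [
    ("sports", "team wins match"),
    ("sports", "team loses match"),
    ("finance", "market wins investors"),
]
scores = distributed_scores(documents, n_nodes=2)
```

Because each worker only holds its own subset and a partial count table, the per-node memory footprint shrinks as the number of nodes grows, which is the effect the abstract reports.
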
Year | DOI | Venue |
---|---|---|
2017 | 10.1109/AINA.2017.99 | 2017 IEEE 31st International Conference on Advanced Information Networking and Applications (AINA) |
Keywords | Field | DocType |
Distributed system, Parallel calculation, Text mining, Document clustering, Document categorization | Categorization, Data mining, Text mining, Information retrieval, Computer science, Information science, Cluster analysis | Conference |
ISSN | ISBN | Citations |
1550-445X | 978-1-5090-6030-6 | 1 |
PageRank | References | Authors |
0.43 | 6 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Masaki Kohana | 1 | 31 | 14.06 |
Hiroki Sakaji | 2 | 30 | 17.97 |
Akio Kobayashi | 3 | 4 | 5.73 |
Shusuke Okamoto | 4 | 65 | 28.98 |