Cluster Correction on Polysemy and Synonymy - Citegraph

Paper Info

Title
Cluster Correction on Polysemy and Synonymy

Abstract
Document clustering (or text clustering) is the application of cluster analysis to textual documents. It has applications in automatic document organization, topic extraction, and fast information retrieval or filtering. Many challenges remain, however; for example, clustering accuracy still needs to be improved. In this regard, the process of cluster correction becomes the object of analysis. In this paper, we focus on the polysemy and synonymy issues in the clustering process. Polysemy is the ambiguity of an individual word or phrase that can be used (in different contexts) to express two or more different meanings. Synonymy, by contrast, is the semantic relation that holds between two or more words that can (in a given context) express the same meaning. Both conditions affect clustering results. To address them, we use a bag-of-words model to distinguish the contexts of the same word and word2vec to re-cluster words with similar meanings. Cosine similarity is also used to measure the similarity between two nonzero vectors in these two models.
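
The bag-of-words and cosine-similarity step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the example sentences, the tiny stop-word list, and the helper names `bag_of_words` and `cosine_similarity` are all assumptions made for illustration. It shows how two contexts of the polysemous word "bank" that share a sense tend to score higher than two contexts with different senses.

```python
import math
from collections import Counter

# Hypothetical stop-word list, kept tiny for the example.
STOP_WORDS = {"the", "was", "after", "we", "along", "and", "an"}

def bag_of_words(text):
    """Lowercase, split on whitespace, drop stop words, count the rest."""
    tokens = (t for t in text.lower().split() if t not in STOP_WORDS)
    return Counter(tokens)

def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        # The measure is only defined for nonzero vectors.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Three made-up contexts that all contain the word "bank".
river_1 = bag_of_words("the river bank was muddy after the flood")
river_2 = bag_of_words("we walked along the muddy river bank")
money = bag_of_words("the bank approved the loan and opened an account")

sim_same_sense = cosine_similarity(river_1, river_2)  # same sense of "bank"
sim_diff_sense = cosine_similarity(river_1, money)    # different senses
print(f"same sense: {sim_same_sense:.2f}, different sense: {sim_diff_sense:.2f}")
```

In this toy setting the same-sense pair shares three content words and the different-sense pair shares only "bank", so the first similarity is higher. The synonymy side of the abstract (re-clustering words with word2vec) would need trained embeddings and is not sketched here.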

Year	DOI	Venue
2017	10.1109/WISA.2017.45	2017 14th Web Information Systems and Applications Conference (WISA)
Keywords	Field	DocType
cluster correction,polysemy,synonymy,cosine similarity,bag of words,word2vec	Bag-of-words model,Cosine similarity,Document clustering,Computer science,Phrase,Artificial intelligence,Natural language processing,Word2vec,Cluster analysis,Semantics,Polysemy	Conference
ISBN	Citations	PageRank
978-1-5386-4807-0	0	0.34
References	Authors
0	4

Authors (4 rows)

Name	Order	Citations	PageRank
Zemin Qin	1	2	1.37
Hao Lian	2	9	3.93
Tieke He	3	58	15.85
Bin Luo	4	66	21.04
