Key word extraction for short text via word2vec, doc2vec, and textrank - Citegraph

Paper Info

Title
Key word extraction for short text via word2vec, doc2vec, and textrank

Abstract
The rapid development of social media encourages people to share their opinions and feelings on the Internet. Every day, a large number of short text comments are generated through Twitter, microblogging, WeChat, etc., and there is high commercial and social value in extracting useful information from these short texts. At present, most studies focus on extracting key words from text. For example, the LDA topic model performs well on long texts, but it loses effectiveness on short texts because of noise and sparsity. In this paper, we use Word2Vec and Doc2Vec to improve short-text key word extraction. We first trained word vectors and paragraph vectors collaboratively and then applied the TextRank model's node clustering. We adjusted the weights of the generated key words by computing the jump probability between nodes, obtained a weighted score for each node, and finally ranked the key words by that score. The experimental results show that the improved method performs well on the datasets.
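
The abstract gives no formulas, so the following is only a minimal sketch of the general idea it describes: a TextRank-style ranking in which the jump probability between two word nodes is proportional to the similarity of their embeddings. It is not the authors' implementation. The function names, the window size, the damping factor, and the tiny hand-written `TOY_VECTORS` (a stand-in for vectors that would come from a trained Word2Vec or Doc2Vec model) are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def weighted_textrank(tokens, vectors, window=2, damping=0.85, iterations=50):
    """Rank words with a similarity-weighted TextRank (illustrative sketch).

    tokens  : list of words from one short text
    vectors : dict word -> embedding (hypothetical stand-in for Word2Vec output)
    """
    # Undirected co-occurrence graph; each edge is weighted by embedding similarity.
    weights = defaultdict(dict)
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            v = tokens[j]
            if v == w:
                continue
            # Clip at a small positive value so every edge keeps a valid weight.
            sim = max(cosine(vectors[w], vectors[v]), 1e-6)
            weights[w][v] = sim
            weights[v][w] = sim

    nodes = sorted(weights)
    if not nodes:
        return []
    scores = {n: 1.0 / len(nodes) for n in nodes}
    out_sum = {n: sum(weights[n].values()) for n in nodes}

    for _ in range(iterations):
        updated = {}
        for n in nodes:
            # Jump probability m -> n is the edge weight normalized over m's edges.
            rank = sum(scores[m] * weights[m][n] / out_sum[m] for m in weights[n])
            updated[n] = (1 - damping) / len(nodes) + damping * rank
        scores = updated

    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data: made-up 2-d vectors, not the output of any trained model.
TOY_VECTORS = {
    "phone": (0.9, 0.1), "battery": (0.8, 0.3), "drains": (0.5, 0.6),
    "fast": (0.2, 0.9), "life": (0.7, 0.4), "short": (0.3, 0.8),
}
tokens = ["phone", "battery", "drains", "fast", "battery", "life", "short"]
ranked = weighted_textrank(tokens, TOY_VECTORS)
for word, score in ranked:
    print(f"{word}\t{score:.3f}")
```

In a real pipeline the toy vectors would be replaced by embeddings learned from the corpus, and the top-ranked words would be taken as the extracted key words.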

Field	Value
Year	2019
DOI	10.3906/elk-1806-38
Venue	TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES
Keywords	Key word extraction, short text, word2vec, doc2vec, textrank
DocType	Journal
Volume	27
Issue	3
ISSN	1300-0632
Citations	1
PageRank	0.40
References	0
Authors	5

Authors (5 rows)

Name	Order	Citations	PageRank
Jun Li	1	1	0.73
Guimin Huang	2	6	9.26
Chunli Fan	3	1	0.40
Zhenglin Sun	4	1	0.40
Hongtao Zhu	5	1	0.40
