Title
Key word extraction for short text via word2vec, doc2vec, and textrank
Abstract
The rapid development of social media encourages people to share their opinions and feelings on the Internet. Every day, a large number of short text comments are generated on Twitter, microblogs, WeChat, and similar platforms, and extracting useful information from these short texts has high commercial and social value. At present, most studies have focused on extracting key words from text. For example, the LDA topic model performs well on long texts but loses effectiveness on short texts because of noise and sparsity. In this paper, we use Word2Vec and Doc2Vec to improve key word extraction for short texts. We first add collaborative training of word vectors and paragraph vectors and then use the clustering nodes of the TextRank model. We adjust the weights of the generated key words by computing the jump probability between nodes, obtain a node-weighted score, and finally rank the generated key words by that score. The experimental results show that the improved method performs well on the datasets.
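The idea of weighting TextRank jump probabilities by embedding similarity can be illustrated with a minimal sketch. This is not the authors' implementation: the toy tokens, the hand-written 3-dimensional vectors (standing in for trained Word2Vec or Doc2Vec embeddings), the window size, and the function name `weighted_textrank` are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def weighted_textrank(tokens, vectors, window=2, damping=0.85, iters=50):
    """Rank words with TextRank, using similarity-weighted jump probabilities.

    Assumption: edge weight = cosine similarity of the two word vectors,
    clipped to a small positive value so every edge can be traversed.
    """
    weights = defaultdict(dict)
    # Connect words that co-occur within the sliding window (undirected).
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            v = tokens[j]
            if v == w:
                continue
            sim = max(cosine(vectors[w], vectors[v]), 1e-6)
            weights[w][v] = sim
            weights[v][w] = sim

    nodes = list(weights)
    out_sum = {n: sum(weights[n].values()) for n in nodes}
    scores = {n: 1.0 / len(nodes) for n in nodes}

    for _ in range(iters):
        new_scores = {}
        for n in nodes:
            # Jump probability from m to n is weights[m][n] / out_sum[m].
            rank = sum(scores[m] * weights[m][n] / out_sum[m] for m in weights[n])
            new_scores[n] = (1 - damping) / len(nodes) + damping * rank
        scores = new_scores

    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical short text and toy embeddings (not from the paper).
TOKENS = ["battery", "life", "phone", "battery", "screen", "phone", "camera"]
VECTORS = {
    "battery": [0.9, 0.1, 0.2],
    "life": [0.7, 0.3, 0.1],
    "phone": [0.5, 0.5, 0.5],
    "screen": [0.2, 0.9, 0.3],
    "camera": [0.1, 0.8, 0.6],
}

ranked = weighted_textrank(TOKENS, VECTORS)
for word, score in ranked[:3]:
    print(f"{word}\t{score:.4f}")
```

In practice the vectors would come from a trained embedding model, and the paper's node clustering step, which this sketch omits, would adjust the weights before ranking.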
Year: 2019
DOI: 10.3906/elk-1806-38
Venue: TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES
Keywords: Key word extraction, short text, word2vec, doc2vec, textrank
DocType: Journal
Volume: 27
Issue: 3
ISSN: 1300-0632
Citations: 1
PageRank: 0.40
References: 0
Authors: 5
Name, Order, Citations, PageRank
Jun Li, 1, 1, 0.73
Guimin Huang, 2, 6, 9.26
Chunli Fan, 3, 1, 0.40
Zhenglin Sun, 4, 1, 0.40
Hongtao Zhu, 5, 1, 0.40