Abstract |
---|
Sentiment classification has gained much attention in the big data era. Most existing methods rely on the bag-of-words model, which disregards contextual information. In many cases, however, the sentiment strength of a word is implicitly associated with its part of speech and context. In this paper, we present a weighted word embeddings (WWE) method that combines word embeddings and part-of-speech (POS) tagging. First, we used a continuous word representation algorithm (Word2Vec) to train a vector model. The algorithm learns the optimal vectors from the context of surrounding words. A polarity score for each word is then calculated from the cosine similarity between the word's vector and the vectors of seed words. The state-of-the-art SyntaxNet was used for POS tagging. We then computed an overall polarity score for the whole sentence from the POS-weighted polarity scores of its words. Finally, majority voting was applied to determine the final polarity. Our experimental results show that the WWE method achieves promising results. Additionally, the methodology was demonstrated on three Twitter datasets from different domains. This robustness suggests that the method can be applied to other sentiment classification problems or domains. We also compared the performance of models trained with various vector dimensions; higher dimensions achieved better performance. |
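
The pipeline described in the abstract (seed-word cosine similarity, POS-weighted sentence score, majority vote) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy 2-dimensional vectors, the seed vectors, the POS weights, the choice to score a word as mean similarity to positive seeds minus mean similarity to negative seeds, and the idea that the vote is taken over several independent label predictions. In practice the vectors would come from a trained Word2Vec model and the tags from a POS tagger such as SyntaxNet.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def word_polarity(vec, pos_seeds, neg_seeds):
    """Assumed scoring rule: mean similarity to positive seeds
    minus mean similarity to negative seeds."""
    pos = sum(cosine(vec, s) for s in pos_seeds) / len(pos_seeds)
    neg = sum(cosine(vec, s) for s in neg_seeds) / len(neg_seeds)
    return pos - neg

def sentence_polarity(tagged_tokens, vectors, pos_seeds, neg_seeds, pos_weights):
    """Sum of word polarity scores, each scaled by a weight for its POS tag.
    Words without a vector are skipped; unknown tags get weight 1.0."""
    total = 0.0
    for word, tag in tagged_tokens:
        if word not in vectors:
            continue
        weight = pos_weights.get(tag, 1.0)
        total += weight * word_polarity(vectors[word], pos_seeds, neg_seeds)
    return total

def majority_vote(labels):
    """Return the most frequent label among several predictions."""
    return Counter(labels).most_common(1)[0][0]

# Hypothetical toy data standing in for a trained embedding model.
vectors = {"great": [0.9, 0.1], "awful": [0.1, 0.9], "movie": [0.5, 0.5]}
pos_seeds = [[1.0, 0.0]]   # stand-in for a positive seed word's vector
neg_seeds = [[0.0, 1.0]]   # stand-in for a negative seed word's vector
pos_weights = {"ADJ": 2.0, "NOUN": 0.5}  # illustrative weights only

sentence = [("great", "ADJ"), ("movie", "NOUN")]
score = sentence_polarity(sentence, vectors, pos_seeds, neg_seeds, pos_weights)
label = "positive" if score > 0 else "negative"
```

Several such labels (for example, one per trained model) could then be passed to `majority_vote` to pick the final polarity; how the paper forms the set of votes is not specified in the abstract.
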
Year | Venue | Keywords |
---|---|---|
2016 | 2016 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) | Big Data, Natural Language Processing, NLP, Natural Language Understanding, NLU, Machine Learning, Word Embeddings, Word2Vec, Social Media, Twitter, Cosine Similarity, Sentiment Classification, Part of Speech, SyntaxNet, Parsey McParseface |
Field | DocType | Citations |
---|---|---|
Data mining, Cosine similarity, Computer science, Decision support system, Robustness (computer science), Part of speech, Artificial intelligence, Word2vec, Majority rule, Big data, Sentence, Machine learning | Conference | 1 |
PageRank | References | Authors |
---|---|---|
0.36 | 28 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Xiangfeng Dai | 1 | 1 | 1.37 |
Bob Prout | 2 | 1 | 0.36 |