Title
Unlock Big Data Emotions: Weighted Word Embeddings For Sentiment Classification
Abstract
Sentiment classification has gained much attention in the big data era. Most existing methods rely on the bag-of-words model, which disregards contextual information. In many cases, however, the sentiment strength of a word is implicitly associated with its part of speech and context. In this paper, we present a WWE (weighted word embeddings) method that combines word embeddings and part-of-speech (POS) tagging. First, we used a continuous word representation algorithm (Word2Vec) to train a vector model. The algorithm learns the optimal vectors from the context of surrounding words. A polarity score for each word is then calculated from the cosine similarity between its vector and the vectors of seed words. The state-of-the-art SyntaxNet was used for POS tagging. We then computed an overall polarity score for the whole sentence from the POS-weighted polarity scores of its words. Finally, majority voting was applied to determine the final polarity. Our experimental results show that the WWE method achieves promising results. Additionally, the methodology was demonstrated on three Twitter datasets from different domains. This robustness suggests that the method can be applied to other sentiment classification problems or domains. We also compared the performance of trained models with various vector dimensions. Higher dimensions achieved better performance.
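The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy 2-D vectors stand in for a trained Word2Vec model, the seed words, the POS weights, and the "mean similarity to positive seeds minus mean similarity to negative seeds" scoring rule are all hypothetical, and the abstract does not say what the majority vote is taken over (here it is assumed to be labels from several models).

```python
import math
from collections import Counter

# Toy 2-D vectors standing in for a trained Word2Vec model (hypothetical values).
VECTORS = {
    "good": [1.0, 0.1], "bad": [-1.0, 0.1],
    "great": [0.9, 0.2], "awful": [-0.9, 0.3],
    "movie": [0.0, 1.0],
}
# Hypothetical POS weights; the abstract does not give the actual values.
POS_WEIGHTS = {"ADJ": 1.0, "ADV": 0.8, "VERB": 0.5, "NOUN": 0.2}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def word_polarity(word, vectors, pos_seeds, neg_seeds):
    """Assumed scoring rule: mean similarity to positive seeds
    minus mean similarity to negative seeds. Unknown words score 0."""
    if word not in vectors:
        return 0.0
    v = vectors[word]
    pos = sum(cosine(v, vectors[s]) for s in pos_seeds) / len(pos_seeds)
    neg = sum(cosine(v, vectors[s]) for s in neg_seeds) / len(neg_seeds)
    return pos - neg


def sentence_polarity(tagged, vectors, pos_seeds, neg_seeds, pos_weights):
    """Sum POS-weighted word scores over (word, tag) pairs and threshold at 0."""
    score = sum(
        pos_weights.get(tag, 0.0) * word_polarity(w, vectors, pos_seeds, neg_seeds)
        for w, tag in tagged
    )
    return "positive" if score >= 0 else "negative"


def majority_vote(labels):
    """Return the most frequent label (assumed to come from several models)."""
    return Counter(labels).most_common(1)[0][0]


if __name__ == "__main__":
    tagged = [("great", "ADJ"), ("movie", "NOUN")]
    print(sentence_polarity(tagged, VECTORS, ["good"], ["bad"], POS_WEIGHTS))
    print(majority_vote(["positive", "negative", "positive"]))
```

In practice the vectors would come from a trained embedding model and the (word, tag) pairs from a POS tagger such as SyntaxNet; both are replaced by hand-written inputs here so the sketch runs on its own.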
Year
2016
Venue
2016 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA)
Keywords
Big Data, Natural Language Processing, NLP, Natural Language Understanding, NLU, Machine Learning, Word Embeddings, Word2Vec, Social Media, Twitter, Cosine Similarity, Sentiment Classification, Part of Speech, SyntaxNet, Parsey McParseface
Field
Data mining, Cosine similarity, Computer science, Decision support system, Robustness (computer science), Part of speech, Artificial intelligence, Word2vec, Majority rule, Big data, Sentence, Machine learning
DocType
Conference
Citations
1
PageRank
0.36
References
28
Authors
2
Name            Order  Citations  PageRank
Xiangfeng Dai   1      1          1.37
Bob Prout       2      1          0.36