Title
Multilingual Sentiment Analysis Using Emoticons and Keywords
Abstract
The World Wide Web has evolved into a leading channel for communication and information exchange. Since the introduction of Web 2.0 and the ensuing explosion of user-generated content, the amount of data available on the Internet has attracted the interest of both the scientific and business communities. Their efforts focus on identifying the inner structure of the data and the knowledge that can be derived by analyzing it. Web 2.0 is a subject of study in a number of areas. One of them is sentiment analysis, whose main goal is to draw conclusions about the subjectivity, polarity, and feeling expressed in user-generated content, which mainly consists of free-text documents. The goal of this paper is to apply sentiment analysis to multilingual data, focusing on documents written in Greek. We developed an integrated framework that accepts user-generated documents and identifies the polarity of the text (neutral, negative, or positive) and the sentiment expressed in it (joy, love, anger, or sadness). We followed a semi-supervised approach, which led to the development of two techniques for collecting training data automatically, without human intervention. Our approach detects and uses self-defining features that are available within the data. We take into account two emotionally rich features: (a) emoticons and (b) lists of emotionally intense keywords. These features are evaluated on data from a popular forum, using various classifiers and feature vectors. Our experimental results point to various conclusions about the effectiveness, advantages, and limitations of applying such methods to Greek data. Using keywords, we achieved 90% mean accuracy in identifying the subjectivity level and 93% in identifying the polarity level, whereas using emoticons the mean accuracy for these levels was 74% and 77%, respectively.
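The idea of collecting training data from self-defining features can be illustrated with a minimal Python sketch. It is not the authors' implementation: the emoticon map, the Greek keyword list, and the function name `auto_label` are hypothetical, and the rule of skipping documents with no cue or with conflicting cues is an assumption made for illustration. The abstract does not say how neutral examples are gathered, so the sketch leaves that out.

```python
import re
import unicodedata

# Hypothetical emoticon-to-sentiment map (illustrative only, not the paper's list).
EMOTICON_SENTIMENT = {
    ":)": "joy", ":-)": "joy", ":D": "joy",
    "<3": "love",
    ">:(": "anger",
    ":(": "sadness", ":-(": "sadness", ":'(": "sadness",
}

# Hypothetical list of emotionally intense Greek keywords (illustrative only).
KEYWORD_SENTIMENT = {
    unicodedata.normalize("NFC", word): sentiment
    for word, sentiment in {
        "χαρά": "joy",
        "αγάπη": "love",
        "θυμός": "anger",
        "λύπη": "sadness",
    }.items()
}

# Polarity implied by each of the four sentiments named in the abstract.
SENTIMENT_POLARITY = {
    "joy": "positive", "love": "positive",
    "anger": "negative", "sadness": "negative",
}

# Try longer emoticons first so ">:(" is not read as ":(".
EMOTICON_RE = re.compile(
    "|".join(re.escape(e) for e in sorted(EMOTICON_SENTIMENT, key=len, reverse=True))
)


def auto_label(text, source="emoticons"):
    """Return (polarity, sentiment) for a document, or None to skip it.

    A document is labeled only when its cues agree on a single sentiment
    (an assumption of this sketch); otherwise it is left out of the
    automatically collected training set.
    """
    text = unicodedata.normalize("NFC", text)
    if source == "emoticons":
        found = {EMOTICON_SENTIMENT[m] for m in EMOTICON_RE.findall(text)}
    else:
        tokens = re.findall(r"\w+", text.lower())
        found = {KEYWORD_SENTIMENT[t] for t in tokens if t in KEYWORD_SENTIMENT}
    if len(found) != 1:
        return None
    sentiment = found.pop()
    return SENTIMENT_POLARITY[sentiment], sentiment


posts = ["Τέλεια μέρα :)", "Δεν το πιστεύω >:(", "Καλημέρα σε όλους"]
training_set = [(p, auto_label(p)) for p in posts if auto_label(p) is not None]
```

The labeled pairs in `training_set` could then be turned into feature vectors and passed to a classifier. Stripping the cue from the text before training is a common precaution, although the abstract does not state whether that was done.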
Year
2014
DOI
10.1109/WI-IAT.2014.86
Venue
IAT), 2014 IEEE/WIC/ACM International Joint Conferences
Keywords
learning (artificial intelligence), natural language processing, pattern classification, text analysis, Greek data, Greek documents, Internet, Web 2.0, World Wide Web, business community, classifiers, communication channel, data inner structures, emoticons, emotionally intense keywords, feature vectors, information exchange medium, multilingual data, multilingual sentiment analysis, scientific community, self-defining features, semisupervised learning, text polarity, user generated content, user generated documents, Automatic Collection of Training Data, Emoticons, Forum, Greek, Keywords, Semi Supervised Learning, Sentiment Analysis
Field
User-generated content, Sadness, Feature vector, Semi-supervised learning, Information retrieval, Computer science, Sentiment analysis, Information exchange, Support vector machine, The Internet
DocType
Conference
Volume
2
ISBN
978-1-4799-4143-8-02
Citations
5
PageRank
0.43
References
11
Authors
3
Name | Order | Citations | PageRank
Georgios S. Solakidis | 1 | 5 | 0.43
Konstantinos N. Vavliakis | 2 | 59 | 7.89
Pericles A. Mitkas | 3 | 25 | 4.31