Title
Stream-based active learning for sentiment analysis in the financial domain.
Abstract
The relationship between public sentiment and stock prices has been the focus of several studies. This paper analyzes whether the sentiment expressed in Twitter feeds, which discuss selected companies and their products, can indicate their stock price changes. To address this problem, an active learning approach was developed and applied to sentiment analysis of tweet streams in the stock market domain. The paper first presents a static Twitter data analysis problem, explored in order to determine the best Twitter-specific text preprocessing setting for training the Support Vector Machine (SVM) sentiment classifier. In the static setting, the Granger causality test shows that sentiments in stock-related tweets can be used as indicators of stock price movements a few days in advance, where improved results were achieved by adapting the SVM classifier to categorize Twitter posts into three sentiment categories of positive, negative and neutral (instead of positive and negative only). These findings were adopted in the development of a new stream-based active learning approach to sentiment analysis, applicable in incremental learning from continuously changing financial tweet streams. To this end, a series of experiments was conducted to determine the best querying strategy for active learning of the SVM classifier adapted to sentiment analysis of financial tweet streams. The experiments in analyzing stock market sentiments of a particular company show that changes in positive sentiment probability can be used as indicators of the changes in stock closing prices.
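The stream-based active learning loop summarized in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the `OnlineLinearSVM` class (a linear model updated by hinge-loss subgradient steps, standing in for the paper's SVM), the margin-based querying rule (one common uncertainty strategy, not necessarily the one the authors found best), the two-class labels, and the toy tweets.

```python
from collections import defaultdict


class OnlineLinearSVM:
    """Hypothetical stand-in for an SVM: a linear model over bag-of-words
    tokens, updated one example at a time with hinge-loss subgradient steps."""

    def __init__(self, learning_rate=0.5, reg=0.01):
        self.weights = defaultdict(float)
        self.learning_rate = learning_rate
        self.reg = reg

    def score(self, tokens):
        # Signed score: positive leans "positive sentiment", negative the opposite.
        return sum(self.weights[t] for t in tokens)

    def update(self, tokens, label):
        # label is +1 (positive) or -1 (negative).
        margin = label * self.score(tokens)
        # L2 regularization: shrink every known weight slightly.
        for t in list(self.weights):
            self.weights[t] *= 1.0 - self.learning_rate * self.reg
        # Hinge loss is active only when the example is inside the margin.
        if margin < 1.0:
            for t in tokens:
                self.weights[t] += self.learning_rate * label


def stream_active_learning(stream, oracle, model, margin_threshold=1.0):
    """Classify tweets one at a time; ask the oracle for a label only when
    the tweet falls inside the margin (|score| < margin_threshold)."""
    predictions, queried = [], 0
    for text in stream:
        tokens = text.lower().split()
        s = model.score(tokens)
        predictions.append(1 if s >= 0 else -1)
        if abs(s) < margin_threshold:
            model.update(tokens, oracle(text))
            queried += 1
    return predictions, queried


# Toy stream with made-up tweets and labels (illustrative only).
labels = {
    "great earnings beat": 1,
    "terrible earnings miss": -1,
    "great product launch": 1,
    "terrible product recall": -1,
    "great beat": 1,
    "terrible miss": -1,
}
predictions, queried = stream_active_learning(
    list(labels), labels.__getitem__, OnlineLinearSVM()
)
print(predictions, queried)
```

In this toy run the first four tweets fall inside the margin and are sent to the oracle, while the last two are classified without a label request, which is the labeling saving that a querying strategy is meant to provide.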
Year
2014
DOI
10.1016/j.ins.2014.04.034
Venue
Information Sciences
Keywords
Predictive sentiment analysis, Stream-based active learning, Stock market, Twitter, Positive sentiment probability, Granger causality
Field
Data mining, Granger causality, Artificial intelligence, Classifier (linguistics), Stock market, Categorization, Active learning, Sentiment analysis, Support vector machine, Preprocessor, Finance, Mathematics, Machine learning
DocType
Journal
Volume
285
Issue
C
ISSN
0020-0255
Citations
58
PageRank
1.47
References
38
Authors
4
Name, Order, Citations PageRank
Jasmina Smailovic, 1, 1457.80
Miha Grcar, 2, 22415.71
Nada Lavrac, 3, 2004635.45
Martin Znidarsic, 4, 13311.51