Title
External validity of sentiment mining reports: Can current methods identify demographic biases, event biases, and manipulation of reviews?
Abstract
Many publications in sentiment mining provide new techniques for improved accuracy in extracting features and corresponding sentiments in texts. For the external validity of these sentiment reports, i.e., the applicability of the results to target audiences, it is important to carefully analyze the context of the user-generated content and the sample of its authors. The literature lacks an analysis of the external validity of sentiment mining reports, and the sentiment mining field lacks an operationalization of external validity dimensions toward practically useful techniques. From a kernel theory, we identify multiple threats to sentiment mining external validity and study three of them empirically: 1) a mismatch in the demographics of the reviewer sample, 2) bias due to reviewers' incidental experiences, and 3) manipulation of reviews. The value of techniques for identifying external validity threats is then examined in cases from Goodreads.com. We conclude that demographic biases can be detected well by current techniques, although we have doubts regarding stylometric techniques for this purpose. We demonstrate the usefulness of event and manipulation bias detection techniques in our cases, but this result needs further replication in more complex and more competitive contexts. Finally, to increase the decisional usefulness of sentiment mining reports, they should be accompanied by external validity reports, and software and service providers in this field should incorporate these in their offerings.
Year: 2014
DOI: 10.1016/j.dss.2013.12.005
Venue: Decision Support Systems
Keywords: corresponding sentiment, sentiment mining external validity, sentiment report, external validity report, sentiment mining field, external validity dimension, external validity, sentiment mining report, current method, event bias, sentiment mining, external validity threat, demographic bias, opinion mining
Field: Data science, Data mining, Sentiment analysis, Computer science, Service provider, Demographics, Kernel theory, Operationalization, External validity
DocType: Journal
Volume: 59
ISSN: 0167-9236
Citations: 2
PageRank: 0.37
References: 74
Authors: 2
Name, Order, Citations, PageRank
Fons Wijnhoven, 1, 151, 15.68
Oscar Bloemen, 2, 2, 0.37