Title
Generating high-coverage semantic orientation lexicons from overtly marked words and a thesaurus
Abstract
Sentiment analysis often relies on a semantic orientation lexicon of positive and negative words. A number of approaches have been proposed for creating such lexicons, but they tend to be computationally expensive, and usually rely on significant manual annotation and large corpora. Most of these methods use WordNet. In contrast, we propose a simple approach to generate a high-coverage semantic orientation lexicon, which includes both individual words and multi-word expressions, using only a Roget-like thesaurus and a handful of affixes. Further, the lexicon has properties that support the Pollyanna Hypothesis. Using the General Inquirer as the gold standard, we show that our lexicon has 14 percentage points more correct entries than the leading WordNet-based high-coverage lexicon (SentiWordNet). In an extrinsic evaluation, we obtain significantly higher performance in determining phrase polarity using our thesaurus-based lexicon than with any other. Additionally, we explore the use of visualization techniques to gain insight into our algorithm beyond the evaluations mentioned above.
Year: 2009
Venue: EMNLP
Keywords: high-coverage semantic orientation lexicon, general inquirer, extrinsic evaluation, gold standard, roget-like thesaurus, pollyanna hypothesis, semantic orientation lexicon, thesaurus-based lexicon, correct entry, leading wordnet-based high-coverage lexicon, marked word, sentiment analysis, determiner phrase
Field: Information retrieval, Expression (mathematics), Computer science, Sentiment analysis, Manual annotation, Phrase, Lexicon, Natural language processing, Artificial intelligence, WordNet, Creative visualization
DocType: Conference
Volume: D09-1
Citations: 72
PageRank: 3.93
References: 26
Authors: 3
Name              Order  Citations  PageRank
Saif M. Mohammad  1      2103       106.31
Cody Dunne        2      437        27.88
Bonnie J. Dorr    3      2150       176.78