Title
Text feature selection for sentiment classification of Chinese online reviews.
Abstract
To meet the demand for customised services in online communities, sentiment classification has been applied to unstructured online reviews to identify users' opinions on particular products. This article selects features for sentiment classification of Chinese online reviews using techniques that perform well in traditional text classification. First, adjectives, adverbs and verbs are identified as the candidate text features carrying sentiment information. Then, four statistical feature selection methods, namely document frequency (DF), information gain (IG), chi-squared statistic (CHI) and mutual information (MI), are used to select features. Next, Boolean weighting is applied to set feature weights and construct a vector space model. Finally, a support vector machine (SVM) classifier is used to predict the sentiment polarity of the reviews. Comparative experiments are conducted on Chinese-language online hotel reviews. The results indicate that the highest classification accuracy is achieved by using adjectives, adverbs and verbs together as features. The feature selection methods also differ in performance: DF performs best, followed by CHI and then IG, whereas MI is not suitable for sentiment classification of Chinese online reviews. These findings can help improve the accuracy of sentiment classification and inform further research.
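The four selection criteria and the Boolean weighting step named in the abstract can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the toy corpus `DOCS`, the English glosses, and all function names are hypothetical, the scoring formulas are the standard contingency-table definitions (the abstract does not spell them out), and the SVM step is omitted.

```python
import math

# Hypothetical toy corpus: tokens are assumed to be already segmented and
# filtered to adjectives, adverbs and verbs (English glosses for readability).
DOCS = [
    (["good", "clean", "like"], "pos"),
    (["good", "very", "comfortable"], "pos"),
    (["bad", "dirty", "very"], "neg"),
    (["bad", "noisy", "dislike"], "neg"),
]


def counts(term, label, docs):
    """Document counts: A = term and class, B = term and other class,
    C = no term and class, D = no term and other class."""
    a = b = c = d = 0
    for tokens, y in docs:
        present = term in tokens
        if present and y == label:
            a += 1
        elif present:
            b += 1
        elif y == label:
            c += 1
        else:
            d += 1
    return a, b, c, d


def entropy(ns):
    """Shannon entropy (in bits) of a list of counts."""
    total = sum(ns)
    return -sum(n / total * math.log2(n / total) for n in ns if n) if total else 0.0


def df(term, docs):
    """Document frequency: number of documents containing the term."""
    a, b, _, _ = counts(term, "pos", docs)
    return a + b


def chi2(term, label, docs):
    """Chi-squared statistic between term presence and class membership."""
    a, b, c, d = counts(term, label, docs)
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return len(docs) * (a * d - c * b) ** 2 / denom if denom else 0.0


def mi(term, label, docs):
    """Pointwise mutual information between term and class (log base 2)."""
    a, b, c, _ = counts(term, label, docs)
    if a == 0:
        return float("-inf")
    return math.log2(a * len(docs) / ((a + c) * (a + b)))


def ig(term, docs):
    """Information gain of term presence, assuming two classes (pos / other)."""
    a, b, c, d = counts(term, "pos", docs)
    n = len(docs)
    return (entropy([a + c, b + d])
            - (a + b) / n * entropy([a, b])
            - (c + d) / n * entropy([c, d]))


def select_by_df(docs, k):
    """Keep the k terms with the highest DF (ties broken alphabetically)."""
    vocab = sorted({t for tokens, _ in docs for t in tokens})
    return sorted(vocab, key=lambda t: (-df(t, docs), t))[:k]


def boolean_vector(tokens, features):
    """Boolean weighting: 1 if the feature occurs in the review, else 0."""
    return [1 if f in tokens else 0 for f in features]
```

On this toy data, `mi` gives "clean" (one document) the same score as "good" (two documents), while `chi2` and `ig` rank "good" higher. That sensitivity to rare terms is one commonly cited weakness of MI, though the toy corpus is far too small to say anything about the paper's actual results.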
Year
2013
DOI
10.1080/0952813X.2012.721139
Venue
JOURNAL OF EXPERIMENTAL & THEORETICAL ARTIFICIAL INTELLIGENCE
Keywords
feature selection method, text classification, sentiment classification, Chinese online reviews
Field
Weighting, Information retrieval, Statistic, Feature selection, Sentiment analysis, Computer science, Support vector machine, Mutual information, Vector space model, Classifier (linguistics)
DocType
Journal
Volume
25
Issue
4
ISSN
0952-813X
Citations
10
PageRank
0.57
References
13
Authors
4
Name               Order   Citations   PageRank
Hongwei Wang       1       78          13.92
Pei Yin            2       41          3.30
Jiani Yao          3       10          0.57
James N. K. Liu    4       529         44.35