Title
Data Quality Controlling for Cross-Lingual Sentiment Classification
Abstract
Cross-lingual sentiment classification aims to perform sentiment classification in one language (the target language) with the help of resources from another language (the source language). Previous studies tend to use all available data in the source language, yet using all the data is observed to perform no better, or even worse, than using a portion of good data. In this paper, we propose a novel task, data quality controlling in the source language, which selects high-quality samples from the source language. To tackle this task, we propose two kinds of data quality measurements, intra- and extra-quality measurements, which are implemented with certainty and similarity measurements respectively. Empirical studies demonstrate the effectiveness of the proposed approach to data quality controlling in the source language.
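The abstract names the two measurements but does not give their formulas, so the following is only a minimal sketch of how such a selection step might look. It assumes, hypothetically, that certainty is the distance of a source-trained classifier's posterior from 0.5, that similarity is the cosine between a source sample and the centroid of target-language documents (with source samples already mapped into the target vocabulary, for example by machine translation), and that the two scores are combined linearly. The function names and the `alpha` and `keep_ratio` parameters are illustrative and do not come from the paper.

```python
import math
from collections import Counter


def certainty(posterior_positive):
    """Intra-quality proxy (assumed): distance of the posterior from 0.5, scaled to [0, 1]."""
    return abs(posterior_positive - 0.5) * 2.0


def cosine(a, b):
    """Cosine similarity between two sparse bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_high_quality(source_samples, target_docs, keep_ratio=0.5, alpha=0.5):
    """Keep the top-scoring fraction of source samples.

    source_samples: list of (tokens, posterior_positive) pairs.
    target_docs: list of token lists from the target language.
    The linear combination weighted by alpha is an assumption of this sketch.
    """
    # Extra-quality proxy (assumed): similarity to the target-side centroid.
    centroid = Counter()
    for doc in target_docs:
        centroid.update(doc)

    scored = []
    for tokens, p_pos in source_samples:
        intra = certainty(p_pos)
        extra = cosine(Counter(tokens), centroid)
        scored.append((alpha * intra + (1.0 - alpha) * extra, tokens))

    # Sort by score only, highest first, and keep at least one sample.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    keep = max(1, int(len(scored) * keep_ratio))
    return [tokens for _, tokens in scored[:keep]]
```

Under these assumptions, a source sample that the classifier labels confidently and that shares vocabulary with the target data ranks above a borderline, off-domain one. Other choices, such as thresholding each score separately, would fit the abstract's description equally well.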
Year
2013
DOI
10.1109/IALP.2013.43
Venue
IALP
Keywords
available data, source language, cross-lingual sentiment, high quality sample, target language, cross-lingual sentiment classification, good data, data quality controlling, data quality measurement, data quality, novel task, sentiment classification, natural language processing
Field
Cache language model, Cross lingual, Data quality, Certainty, Sentiment analysis, Computer science, Artificial intelligence, Language identification, Natural language processing, Empirical research
DocType
Conference
Citations
0
PageRank
0.34
References
6
Authors
5
Name | Order | Citations | PageRank
Shoushan Li | 1 | 538 | 52.58
Yunxia Xue | 2 | 2 | 1.04
Zhong-qing Wang | 3 | 140 | 20.28
Sophia Yat Mei Lee | 4 | 194 | 15.89
Chu-Ren Huang | 5 | 600 | 136.84