Data Quality Controlling for Cross-Lingual Sentiment Classification - Citegraph

Paper Info

Title
Data Quality Controlling for Cross-Lingual Sentiment Classification

Abstract
Cross-lingual sentiment classification aims to perform sentiment classification in one language (the target language) with the help of resources from another language (the source language). Previous studies tend to use all available data in the source language, although using all the data is observed to perform no better, or even worse, than using a portion of good data. In this paper, we propose a novel task called data quality controlling in the source language, which selects high-quality samples from the source language. To tackle this task, we propose two kinds of data quality measurements, intra-quality and extra-quality measurements, which are implemented with certainty and similarity measurements respectively. The empirical studies demonstrate the effectiveness of the proposed approach to data quality controlling in the source language.
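The abstract names the two measurements but this record does not give their formulas, so the following is only a minimal sketch of the general idea and not the authors' method. It assumes that certainty is the highest class probability from some source-language classifier, that similarity is the cosine similarity between a source sample and the centroid of the target-language samples, and that the two are combined by a weighted sum. The function names and the `keep` and `alpha` parameters are hypothetical.

```python
import math

def certainty(probs):
    """Intra-quality proxy (assumption): the highest class probability."""
    return max(probs)

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def select_samples(source_vecs, source_probs, target_vecs, keep=0.5, alpha=0.5):
    """Return sorted indices of the top `keep` fraction of source samples.

    The score is a weighted sum (assumption, not taken from the paper):
    alpha * certainty + (1 - alpha) * similarity to the target centroid.
    """
    center = centroid(target_vecs)
    scores = [
        alpha * certainty(p) + (1 - alpha) * cosine(v, center)
        for v, p in zip(source_vecs, source_probs)
    ]
    k = max(1, int(len(scores) * keep))  # always keep at least one sample
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])

# Toy data: three source samples, two target samples, two features.
source_vecs = [[1, 0], [0, 1], [1, 1]]
source_probs = [[0.9, 0.1], [0.5, 0.5], [0.6, 0.4]]
target_vecs = [[1, 0], [2, 0]]
selected = select_samples(source_vecs, source_probs, target_vecs, keep=0.5)
```

With this toy data the first source sample is both the most confidently classified and the closest to the target centroid, so it is the one retained. In practice the feature vectors and probabilities would come from whatever representation and classifier are used for the translated source data.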

Year	DOI	Venue
2013	10.1109/IALP.2013.43	IALP
Keywords	Field	DocType
available data,source language,cross-lingual sentiment,high quality sample,target language,cross-lingual sentiment classification,good data,data quality controlling,data quality measurement,data quality,novel task,sentiment classification,natural language processing	Cache language model,Cross lingual,Data quality,Certainty,Sentiment analysis,Computer science,Artificial intelligence,Language identification,Natural language processing,Empirical research	Conference
Citations	PageRank	References
0	0.34	6
Authors
5

Authors (5 rows)

Name	Order	Citations	PageRank
Shoushan Li	1	538	52.58
Yunxia Xue	2	2	1.04
Zhong-qing Wang	3	140	20.28
Sophia Yat Mei Lee	4	194	15.89
Chu-Ren Huang	5	600	136.84
