Title
Empirical Study of Chinese Text Similarity Computation Based on Machine Translation
Abstract
To address the problems of Chinese text similarity calculation based on word frequency statistics, this paper proposes a method that uses machine translation to translate Chinese text into English and then computes the similarity of the given texts indirectly, on the English translations. The method avoids some shortcomings of Chinese word segmentation, exploits the natural word segmentation of English, and, through machine translation, indirectly takes the semantics of some words into account. Experiments compare the method with computing similarity directly on the Chinese text, followed by a detailed analysis. The results show that the method improves similarity computation for most social texts and increases the overall accuracy of the computation.
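The pipeline described in the abstract (translate, split the English text into words, compare word frequencies) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the similarity measure, so cosine similarity over term-frequency vectors is an assumption here. The `translate` argument is a hypothetical placeholder for whatever machine translation system is used, and the function names are illustrative only.

```python
import math
import re
from collections import Counter
from typing import Callable


def term_frequencies(english_text: str) -> Counter:
    """Count lowercase word tokens. English needs no segmenter: a split on non-letters suffices."""
    return Counter(re.findall(r"[a-z]+", english_text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse term-frequency vectors (assumed measure)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def translated_similarity(text_a: str, text_b: str,
                          translate: Callable[[str], str]) -> float:
    """Translate both Chinese texts, then compare their English term-frequency vectors.

    `translate` is a hypothetical stand-in for a machine translation call.
    """
    return cosine_similarity(term_frequencies(translate(text_a)),
                             term_frequencies(translate(text_b)))


# Toy lookup table standing in for a real translation system (illustration only).
toy_translations = {"我喜欢猫": "I like cats", "我喜欢狗": "I like dogs"}


def toy_translate(text: str) -> str:
    return toy_translations.get(text, "")


score = translated_similarity("我喜欢猫", "我喜欢狗", toy_translate)
```

In this toy case the two translations share two of their three words, so the score falls strictly between 0 and 1. Any real use would replace `toy_translate` with an actual translation service and would likely add stop-word filtering or term weighting, neither of which the abstract specifies.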
Year
2011
DOI
10.1109/SKG.2011.19
Venue
SKG
Keywords
Chinese word segmentation, Chinese text translation, Chinese text similarity calculation, machine translation, natural word segmentation, Chinese text, word processing, word frequency statistics, statistical analysis, word frequency statistic, Chinese text similarity computation, language translation, detailed analysis, English text, natural language processing, social text, similarity computation, empirical study, text analysis, text similarity, vectors, word segmentation, word frequency, information processing, computational modeling, computer model, information retrieval, semantics
Field
Example-based machine translation, Language translation, Word lists by frequency, Computer science, Machine translation, Text segmentation, Transfer-based machine translation, Natural language processing, Artificial intelligence, Word processing, Semantics
DocType
Conference
ISBN
978-1-4577-1323-1
Citations
0
PageRank
0.34
References
1
Authors
4
Name            Order  Citations  PageRank
Yu Xu           1      118        5.96
Jianxun Liu     2      58         5.07
Mingdong Tang   3      9          4.06
Yiping Wen      4      24         7.31