Title
Multi-Feature Based Chinese-English Named Entity Extraction From Comparable Corpora
Abstract
Bilingual Named Entity extraction is important to cross-language information processing tasks such as machine translation (MT) and cross-lingual information retrieval (CLIR). Much previous work extracted bilingual Named Entities from parallel corpora. Here we propose a multi-feature based method to extract bilingual Named Entities from comparable corpora. We first recognize the Chinese and English Named Entities separately in the Chinese and English parts of the comparable corpus. Then all the feature scores are calculated for every possible pair of Chinese and English Named Entities. Finally, we combine these feature scores and decide which pairs are mutual translations. For translation score calculation, we do not use the IBM Model 1 formula as previous approaches did. Instead, we use a modified edit distance that takes the order of words into consideration. Experiments show that the F-score of this method increased by 11%, and that the multi-feature integration strategy yields encouraging results.
http://www.aclweb.org/anthology/Y06-1018
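The abstract does not give the exact formulas, so the following is only a minimal sketch of the two ideas it names: an order-sensitive, word-level edit distance used as a translation score, and a combination of several feature scores. The function names, the `lexicon` dictionary, the normalization by the longer length, and the weighted sum are all assumptions made for illustration, not the authors' definitions.

```python
from typing import Dict, List, Set


def word_edit_distance(zh_words: List[str], en_words: List[str],
                       lexicon: Dict[str, Set[str]]) -> int:
    """Word-level Levenshtein distance between two Named Entities.

    Assumption: substituting a Chinese word with an English word is free
    when the (lowercased) English word is listed in `lexicon` as one of
    its translations; every other edit costs 1.
    """
    m, n = len(zh_words), len(en_words)
    # dist[i][j]: cost of aligning the first i Chinese words
    # with the first j English words.
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dist[i][0] = i
    for j in range(1, n + 1):
        dist[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            is_translation = (
                en_words[j - 1].lower() in lexicon.get(zh_words[i - 1], set())
            )
            dist[i][j] = min(
                dist[i - 1][j] + 1,                              # delete a Chinese word
                dist[i][j - 1] + 1,                              # insert an English word
                dist[i - 1][j - 1] + (0 if is_translation else 1),  # substitute
            )
    return dist[m][n]


def translation_score(zh_words: List[str], en_words: List[str],
                      lexicon: Dict[str, Set[str]]) -> float:
    """Map the distance to [0, 1]; the normalization is an assumption."""
    longest = max(len(zh_words), len(en_words))
    if longest == 0:
        return 0.0
    return 1.0 - word_edit_distance(zh_words, en_words, lexicon) / longest


def combined_score(features: Dict[str, float],
                   weights: Dict[str, float]) -> float:
    """Hypothetical weighted sum of per-pair feature scores."""
    return sum(weights[name] * value for name, value in features.items())
```

With a toy lexicon mapping 北京 to "beijing" and 大学 to "university", the pair (北京 大学, "Beijing University") gets distance 0, while the reordered "University Beijing" gets distance 2. A bag-of-words score would not separate the two candidates, which is the point of taking word order into account.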
Year
2006
DOI
null
Venue
PACLIC 20: PROCEEDINGS OF THE 20TH PACIFIC ASIA CONFERENCE ON LANGUAGE, INFORMATION AND COMPUTATION
Field
Edit distance, IBM, Information retrieval, Computer science, Machine translation, Named entity, Artificial intelligence, Natural language processing, Feature based
DocType
Conference
Volume
null
Issue
null
ISSN
null
Citations
2
PageRank
0.36
References
5
Authors
2
Name | Order | Citations | PageRank
Min Lu | 1 | 2 | 0.36
Jun Zhao | 2 | 2119 | 115.52