Title
Transliterated Named Entity Recognition Based on Chinese Word Sketch
Abstract
One of the unique challenges to Chinese Language Processing is cross-strait named entity recognition. Due to the adoption of different transliteration strategies, foreign name transliterations can vary greatly between the PRC and Taiwan. This situation poses a serious problem for NLP tasks, including data mining, translation and information retrieval. In this paper, we introduce a novel approach to automatic extraction of divergent transliterations of foreign named entities by bootstrapping co-occurrence statistics from tagged Chinese corpora. In this study, we use Chinese Word Sketch. The automatically bootstrapped transliteration pairs are further screened based on phonetic similarity. The precision is evaluated to be more than 90% against manually corrected transliteration pairs.
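The screening step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which phonetic similarity measure is used, so this example substitutes a plain string-similarity ratio over toneless pinyin as a stand-in. The `PINYIN` table, the function names, the 0.6 threshold, and the example name pairs (common PRC/Taiwan variants for "Bush" and "Reagan") are all assumptions made for illustration and are not taken from the paper.

```python
from difflib import SequenceMatcher

# Hypothetical, hand-supplied toneless pinyin for a few words (illustration only).
PINYIN = {
    "布什": "bushi",     # PRC transliteration of "Bush"
    "布希": "buxi",      # Taiwan transliteration of "Bush"
    "里根": "ligen",     # PRC transliteration of "Reagan"
    "雷根": "leigen",    # Taiwan transliteration of "Reagan"
    "总统": "zongtong",  # "president": a co-occurring word, not a transliteration
}


def phonetic_similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between the pinyin strings of two words."""
    return SequenceMatcher(None, PINYIN[a], PINYIN[b]).ratio()


def screen(candidates, threshold=0.6):
    """Keep only candidate pairs whose pinyin similarity reaches the threshold.

    The threshold is an assumed value, not one reported in the paper.
    """
    return [(a, b) for a, b in candidates if phonetic_similarity(a, b) >= threshold]


# Candidate pairs as they might come out of a co-occurrence bootstrapping step.
candidates = [("布什", "布希"), ("里根", "雷根"), ("布什", "总统")]
kept = screen(candidates)
print(kept)
```

In this toy run the two genuine variant pairs pass the filter, while the pair that merely co-occurs is rejected because its pinyin strings share no characters. A real system would likely need a phonetically informed measure rather than raw string overlap.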
Year
2008
DOI
10.1142/S1793840608001780
Venue
International Journal of Computer Processing of Languages
Keywords
data mining, transliteration
Field
Entity linking, Word sketch, Computer science, Natural language processing, Artificial intelligence, Named-entity recognition, Transliteration
DocType
Journal
Citations
0
PageRank
0.34
References
2
Authors
4
Name | Order | Citations and PageRank (digits run together)
Petr Simon | 1 | 22.87
Chu-Ren Huang | 2 | 600136.84
Shu-kai Hsieh | 3 | 4721.47
Jia-Fei Hong | 4 | 189.06