Title
A novel approach for proper name transliteration verification
Abstract
Proper name transliteration, the pronunciation-based translation of a proper name, is important to many multilingual natural language processing tasks, such as Statistical Machine Translation (SMT) and Cross Lingual Information Retrieval (CLIR). The task is extremely challenging because of pronunciation differences between the source and target languages: a given proper name can lead to many different transliterations. Past research has reported error rates of 30-50% when using the top-1 reference for transliteration, and this error degrades the performance of many applications. In this paper, a novel approach is proposed to verify a given proper name transliteration pair using a discrete variant Hidden Markov Model (HMM) alignment. The state emission probabilities are derived from SMT phrase tables. The proposed method yields an Equal Error Rate (EER) of 3.73% on a test set of 300 matched and 1000 unmatched name pairs. By comparison, the commonly used SMT framework yields an EER of 6.5% under its best configuration, and the widely used edit distance approach has an EER of 22%. The new method achieves high accuracy and low complexity, and provides an alternative for name transliteration in CLIR and other cross lingual natural language applications such as word alignment and machine translation.
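The abstract compares methods by Equal Error Rate, the operating point at which the false acceptance rate on unmatched pairs equals the false rejection rate on matched pairs. A minimal sketch of how such a figure could be computed from verification scores is given below. It assumes that a higher score means a more likely match, and the function name `equal_error_rate` and its threshold sweep are illustrative, not the authors' evaluation procedure.

```python
from typing import Sequence, Tuple


def equal_error_rate(
    matched: Sequence[float], unmatched: Sequence[float]
) -> Tuple[float, float]:
    """Return (eer, threshold) for scores where higher means 'more likely a match'.

    Hypothetical helper: sweeps every observed score as a threshold and keeps
    the one where false acceptance and false rejection rates are closest.
    """
    if not matched or not unmatched:
        raise ValueError("both score lists must be non-empty")

    best_gap, best_eer, best_thr = float("inf"), 1.0, 0.0
    for thr in sorted(set(matched) | set(unmatched)):
        # False rejection: a true transliteration pair scored below the threshold.
        frr = sum(s < thr for s in matched) / len(matched)
        # False acceptance: an unmatched pair scored at or above the threshold.
        far = sum(s >= thr for s in unmatched) / len(unmatched)
        gap = abs(far - frr)
        if gap < best_gap:
            # With finitely many scores the two rates rarely coincide exactly,
            # so the EER is approximated by their average at the closest point.
            best_gap, best_eer, best_thr = gap, (far + frr) / 2, thr
    return best_eer, best_thr
```

Because each rate is normalized by its own class size, the measure stays meaningful on an unbalanced test set such as the 300 matched and 1000 unmatched pairs described above.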
Year
2010
DOI
10.1109/ISCSLP.2010.5684842
Venue
ISCSLP
Keywords
multilingual natural language processing task, smt phrase tables, information retrieval, transliteration, cross lingual information retrieval, state emission probabilities, equal error rate, language translation, proper name transliteration verification, pronunciation based translation, natural language processing, cross lingual ir, hidden markov models, component, machine translation, probability, discrete variant hidden markov model, natural language, computational modeling, edit distance, noise measurement, proper names, hidden markov model, decoding, kernel
Field
Edit distance, Language translation, Computer science, Machine translation, Artificial intelligence, Natural language processing, Pattern recognition, Word error rate, Speech recognition, Natural language, Hidden Markov model, Proper noun, Transliteration
DocType
Conference
ISBN
978-1-4244-6244-5
Citations
0
PageRank
0.34
References
16
Authors
5
Name | Order | Citations | PageRank
Jan, E.-E. | 1 | 148 | 39.33
Niyu Ge | 2 | 195 | 21.69
Shih-Hsiang Lin | 3 | 142 | 14.07
Salim Roukos | 4 | 6248 | 845.50
Jeffrey S. Sorensen | 5 | 154 | 16.12