Extraction of Name and Transliteration in Monolingual and Parallel Corpora - Citegraph

Paper Info

Title
Extraction of Name and Transliteration in Monolingual and Parallel Corpora

Abstract
Named-entities in free text represent a challenge to text analysis in Machine Translation and Cross Language Information Retrieval. These phrases are often transliterated into another language with a different sound inventory and writing system. Named-entities found in free text are often not listed in bilingual dictionaries. Although it is possible to identify and translate named-entities on the fly without a list of proper names and transliterations, an extensive list of existing transliterations will certainly ensure a high precision rate. We use a seed list of proper names and transliterations to train a Machine Transliteration Model. With the model it is possible to extract proper names and their transliterations in monolingual or parallel corpora with high precision and recall rates.

Year	2004
DOI	10.1007/978-3-540-30194-3_20
Venue	Lecture Notes in Computer Science
Keywords	machine translation, proper names, text analysis
Field	Bilingual dictionary, Computer science, Multilingualism, Precision and recall, Machine translation, Computational linguistics, Speech recognition, Artificial intelligence, Natural language processing, Proper noun, Cross-language information retrieval, Transliteration
DocType	Conference
Volume	3265
ISSN	0302-9743
Citations	3
PageRank	0.50
References	4
Authors	3

Authors (3 rows)

Name	Order	Citations	PageRank
Tracy Lin	1	13	2.77
Jian-Cheng Wu	2	70	13.30
Jason S. Chang	3	345	62.64
