Title
Contribution to terminology internationalization by word alignment in parallel corpora.
Abstract
Creating a complete translation of a large vocabulary is a time-consuming task, which requires skilled and knowledgeable medical translators. Our goal is to examine to what extent such a task can be alleviated by a specific natural language processing technique, word alignment in parallel corpora. We experiment with translation from English to French. We build a large corpus of parallel English-French documents and automatically align it at the document, sentence and word levels using state-of-the-art alignment methods and tools. We then project English terms from existing controlled vocabularies onto the aligned word pairs, and examine the number and quality of the putative French translations obtained thereby. We considered three American vocabularies present in the UMLS with three different translation statuses: the MeSH, SNOMED CT, and the MedlinePlus Health Topics. We obtained several thousand new translations of our input terms, this number being closely linked to the number of terms in the input vocabularies. Our study shows that alignment methods can extract a number of new term translations from large bodies of text with a moderate human reviewing effort, and can thus help a human translator obtain better translation coverage of an input vocabulary. Short-term perspectives include applying these methods to a corpus 20 times larger than the one used here, together with more focused methods for term extraction.
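The term projection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `project_terms`, the input layout (token lists plus a set of index pairs), and the choice to take the contiguous French span covering all linked words are assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def project_terms(terms, aligned_pairs):
    """Collect candidate French translations for English terms.

    terms: iterable of English terms (single- or multi-word strings).
    aligned_pairs: iterable of (en_tokens, fr_tokens, links), where links
        is a set of (en_index, fr_index) word-alignment pairs.
    Returns a dict mapping each matched term to a Counter of candidates.
    (Hypothetical data layout, assumed for this sketch.)
    """
    term_tokens = {t: t.lower().split() for t in terms}
    candidates = defaultdict(Counter)
    for en_tokens, fr_tokens, links in aligned_pairs:
        en_lower = [w.lower() for w in en_tokens]
        for term, toks in term_tokens.items():
            n = len(toks)
            # Slide over the English sentence looking for the term.
            for start in range(len(en_lower) - n + 1):
                if en_lower[start:start + n] != toks:
                    continue
                # French positions linked to any word of the term.
                fr_idx = sorted({j for i, j in links if start <= i < start + n})
                if not fr_idx:
                    continue
                # Assumption: take the contiguous span covering all links.
                span = fr_tokens[fr_idx[0]:fr_idx[-1] + 1]
                candidates[term][" ".join(span).lower()] += 1
    return candidates


# Toy example with a hand-written alignment (not data from the study).
en = ["The", "patient", "has", "heart", "failure"]
fr = ["Le", "patient", "présente", "une", "insuffisance", "cardiaque"]
links = {(0, 0), (1, 1), (2, 2), (3, 5), (4, 4)}
result = project_terms(["heart failure"], [(en, fr, links)])
```

Counting how often each candidate recurs across the corpus gives a simple way to rank putative translations before human review.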
Year
2006
Venue
AMIA
Keywords
unified medical language system, computer science, systematized nomenclature of medicine, natural language processing
Field
MedlinePlus, Terminology, Computer science, Systematized Nomenclature of Medicine, Controlled vocabulary, Natural language processing, Artificial intelligence, SNOMED CT, Unified Medical Language System, Vocabulary, Sentence
DocType
Conference
ISSN
1942-597X
Citations
3
PageRank
0.44
References
9
Authors
3
Name                Order  Citations  PageRank
Louise Deleger      1      234        20.13
Magnus Merkel       2      152        25.52
Pierre Zweigenbaum  3      773        85.43