Title
Developing LMF-XML Bilingual Dictionaries for Colloquial Arabic Dialects.
Abstract
The Linguistic Data Consortium and Georgetown University Press are collaborating to create updated editions of bilingual dictionaries originally published in the 1960s for English-speaking learners of Moroccan, Syrian and Iraqi Arabic. In their first editions, these dictionaries used an ad hoc Latin-alphabet orthography for each colloquial Arabic dialect but adopted some properties of Arabic-based writing (collation order of Arabic headwords, clitic attachment to word forms in example phrases). Despite their common features, notable differences among the three books impede comparisons across the dialects, as well as comparisons of each dialect to Modern Standard Arabic. In updating these volumes, we use both Arabic script and International Phonetic Alphabet orthographies; the former provides a common basis for word recognition across dialects, while the latter provides dialect-specific pronunciations. Our goal is to preserve the full content of the original publications, supplement the Arabic headword inventory with new usages, and produce a uniform lexicon structure expressible via the Lexical Markup Framework (LMF, ISO 24613). To this end, we developed a relational database schema that applies consistently to each dialect, and HTTP-based tools for searching, editing, workflow, review and inventory management.
Year
2012
Venue
LREC 2012 - Eighth International Conference on Language Resources and Evaluation
Keywords
Arabic dialect, bilingual dictionary, lexical markup
Field
Linguistic Data Consortium, Computer science, Lexical Markup Framework, Orthography, Modern Standard Arabic, Lexicon, Natural language processing, Artificial intelligence, International Phonetic Alphabet, Headword, Arabic script
DocType
Conference
Citations
3
PageRank
0.38
References
0
Authors
2
Name              Order   Citations   PageRank
David Graff       1       71          23.77
Mohamed Maamouri  2       112         13.34