Title
Spoken Arabic dialect identification using phonotactic modeling
Abstract
The Arabic language is a collection of multiple variants, among which Modern Standard Arabic (MSA) has a special status as the formal written standard language of the media, culture and education across the Arab world. The other variants are informal spoken dialects that are the media of communication for daily life. Arabic dialects differ substantially from MSA and from each other in terms of phonology, morphology, lexical choice and syntax. In this paper, we describe a system that automatically identifies the Arabic dialect (Gulf, Iraqi, Levantine, Egyptian and MSA) of a speaker given a sample of his/her speech. The phonotactic approach we use proves to be effective in identifying these dialects, with an overall accuracy of 81.60% using 30-second test utterances.
Year
2009
Venue
SEMITIC@EACL
Keywords
multiple variant,phonotactic approach,arab world,phonotactic modeling,formal written standard language,spoken arabic dialect identification,arabic dialect,arabic language,daily life,modern standard arabic,considerable overall accuracy,lexical choice,computer science
Field
Lexical choice,Phonotactics,Standard language,Computer science,Modern Standard Arabic,Natural language processing,Arabic languages,Artificial intelligence,Phonology,Syntax,Linguistics,Modern Arabic mathematical notation
DocType
Conference
Citations
37
PageRank
2.42
References
10
Authors
3
Name, Order, Citations, PageRank
Fadi Biadsy, 1, 207, 15.14
Julia Hirschberg, 2, 2982, 448.62
Nizar Habash, 3, 1833, 145.59