Title
Determining Case in Arabic: Learning Complex Linguistic Behavior Requires Complex Linguistic Features.
Abstract
This paper discusses automatic determination of case in Arabic. This task is a major source of errors in full diacritization of Arabic. We use a gold-standard syntactic tree, and obtain an error rate of about 4.2%, with a machine learning-based system outperforming a system using hand-written rules. A careful error analysis suggests that when we account for annotation errors in the gold standard, the error rate drops to 0.8%, with the hand-written rules outperforming the machine learning-based system.
Year
2007
Venue
EMNLP-CoNLL
Keywords
machine learning, gold standard, error rate
Field
Annotation, Arabic, Computer science, Word error rate, Speech recognition, Natural language processing, Artificial intelligence, Generalization error, Syntax, Machine learning
DocType
Conference
Volume
D07-1
Citations
12
PageRank
0.64
References
4
Authors
5
Name                 Order  Citations  PageRank
Nizar Habash         1      1833       145.59
Ryan Gabbard         2      83         7.45
Owen Rambow          3      2256       247.69
Seth Kulick          4      221        29.66
Mitchell P. Marcus   5      3098       854.76