Title | ||
---|---|---|
Determining Case in Arabic: Learning Complex Linguistic Behavior Requires Complex Linguistic Features. |
Abstract | ||
---|---|---|
This paper discusses automatic determination of case in Arabic. This task is a major source of errors in full diacritization of Arabic. We use a gold-standard syntactic tree, and obtain an error rate of about 4.2%, with a machine learning based system outperforming a system using hand-written rules. A careful error analysis suggests that when we account for annotation errors in the gold standard, the error rate drops to 0.8%, with the hand-written rules outperforming the machine learning-based system. |
Year | Venue | Keywords |
---|---|---|
2007 | EMNLP-CoNLL | machine learning, gold standard, error rate |
Field | DocType | Volume |
Annotation, Arabic, Computer science, Word error rate, Speech recognition, Natural language processing, Artificial intelligence, Generalization error, Syntax, Machine learning | Conference | D07-1 |
Citations | PageRank | References |
12 | 0.64 | 4 |
Authors | ||
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Nizar Habash | 1 | 1833 | 145.59 |
Ryan Gabbard | 2 | 83 | 7.45 |
Owen Rambow | 3 | 2256 | 247.69 |
Seth Kulick | 4 | 221 | 29.66 |
Mitchell P. Marcus | 5 | 3098 | 854.76 |