Title
Handling unknown words in statistical latent-variable parsing models for Arabic, English and French
Abstract
This paper presents a study of the impact of using simple and complex morphological clues to improve the classification of rare and unknown words for parsing. We compare this approach to a language-independent technique, often used in parsers, that is based solely on word frequencies. The study is applied to three languages that exhibit different levels of morphological expressiveness: Arabic, French and English. We integrate information about Arabic affixes and morphotactics into a PCFG-LA parser and obtain state-of-the-art accuracy. We also show that these morphological clues can be learnt automatically from an annotated corpus.
Year
2010
Venue
SPMRL@NAACL-HLT
Keywords
arabic affix,morphological expressiveness,morphological clue,exhibit different level,annotated corpus,statistical latent-variable,pcfg-la parser,state-of-the-art accuracy,complex morphological clue,unknown word,language-independent technique
Field
Word lists by frequency,Arabic,Computer science,Latent variable,Speech recognition,Natural language processing,Artificial intelligence,Parsing,Expressivity
DocType
Conference
Citations
30
PageRank
0.97
References
14
Authors
6
Name, Order, Citations, PageRank
Mohammed Attia, 1, 146, 16.51
Jennifer Foster, 2, 454, 38.25
Deirdre Hogan, 3, 183, 11.18
Joseph Le Roux, 4, 175, 16.34
Lamia Tounsi, 5, 167, 10.46
Josef van Genabith, 6, 1037, 105.64