Title
Improving Transition-Based Dependency Parsing of Hindi and Urdu by Modeling Syntactically Relevant Phenomena
Abstract
In recent years, transition-based parsers have shown promise in terms of efficiency and accuracy. Though these parsers have been extensively explored for multiple Indian languages, there is still considerable scope for improvement by properly incorporating syntactically relevant information. In this article, we enhance transition-based parsing of Hindi and Urdu by redefining the features and feature extraction procedures previously proposed in the parsing literature on Indian languages. We empirically show that properly incorporating syntactically relevant information such as case marking, complex predication, and grammatical agreement in an arc-eager parsing model can significantly improve parsing accuracy. Our experiments show an absolute improvement of ∼2% in labeled attachment score (LAS) for both Hindi and Urdu over a competitive baseline that uses rich features such as part-of-speech (POS) tags, chunk tags, cluster IDs, and lemmas. We also propose heuristics for identifying ezafe constructions in Urdu text, which show promising results in parsing these constructions.
Year
2017
DOI
10.1145/3005447
Venue
ACM Trans. Asian & Low-Resource Lang. Inf. Process.
Keywords
Experimentation, Languages, Dependency parsing, averaged perceptron, shift-reduce parsing, normalized pointwise mutual information, treebanks
Field
Top-down parsing language, Top-down parsing, S-attributed grammar, Hindi, Computer science, Bottom-up parsing, Dependency grammar, Natural language processing, Artificial intelligence, Parser combinator, Parsing
DocType
Journal
Volume
16
Issue
3
ISSN
2375-4699
Citations
2
PageRank
0.38
References
37
Authors
6