Title
HamleDT: Harmonized multi-language dependency treebank
Abstract
We present HamleDT, a HArmonized Multi-LanguagE Dependency Treebank. HamleDT is a compilation of existing dependency treebanks (or dependency conversions of other treebanks), transformed so that they all conform to the same annotation style. In the present article, we provide a thorough investigation and discussion of a number of phenomena that are comparable across languages, though their annotation in treebanks often differs. We claim that transformation procedures can be designed to automatically identify most such phenomena and convert them to a unified annotation style. This unification is beneficial both to comparative corpus linguistics and to machine learning of syntactic parsing.
Year
2014
DOI
10.1007/s10579-014-9275-2
Venue
Language Resources and Evaluation
Keywords
Dependency treebank, Annotation scheme, Harmonization
Field
Syntactic parsing, Annotation, Information retrieval, Unification, Computer science, Corpus linguistics, Artificial intelligence, Treebank, Natural language processing, Multi language
DocType
Journal
Volume
48
Issue
4
ISSN
1574-020X
Citations
15
PageRank
0.91
References
24
Authors
8
Name                 Order  Citations  PageRank
Daniel Zeman         1      434        37.62
Ondřej Dušek         2      180        23.08
David Mareček        3      114        8.57
Martin Popel         4      269        21.27
Loganathan Ramasamy  5      53         3.67
Jan Štěpánek         6      86         7.37
Zdeněk Žabokrtský    7      193        22.23
Jan Hajič            8      1184       109.62