Title
Is it really that difficult to parse German?
Abstract
This paper presents a comparative study of probabilistic treebank parsing of German, using the Negra and TüBa-D/Z treebanks. Experiments with the Stanford parser, which uses a factored PCFG and dependency model, show that, contrary to previous claims for other parsers, lexicalization of PCFG models boosts parsing performance for both treebanks. The experiments also show a large difference in parsing performance between models trained on Negra and models trained on TüBa-D/Z. Parser performance for the models trained on TüBa-D/Z is comparable to parsing results for English with the Stanford parser trained on the Penn treebank. This comparison at least suggests that German is not harder to parse than its West-Germanic neighbor language English.
Year
2006
Venue
EMNLP
Keywords
factored pcfg,parser performance,parsing result,pcfg model,parsing performance,z treebanks,probabilistic treebank,penn treebank,z tree-banks,stanford parser
Field
Top-down parsing language,Top-down parsing,Computer science,Simple LR parser,Speech recognition,Bottom-up parsing,Natural language processing,Artificial intelligence,Treebank,Parser combinator,Parsing,German
DocType
Conference
Volume
W06-16
ISBN
1-932432-73-6
Citations
13
PageRank
0.87
References
12
Authors
3
Name, Order, Citations, PageRank
Sandra Kübler, 1, 56, 13.29
Erhard W. Hinrichs, 2, 204, 45.42
Wolfgang Maier, 3, 13, 0.87