Title
Lemmatization and Morphological Tagging in German and Latin: A Comparison and a Survey of the State-of-the-art.
Abstract
This paper addresses the challenge of morphological tagging and lemmatization in morphologically rich languages, using German and Latin as examples. We focus on the question of what a practitioner can expect when using state-of-the-art solutions out of the box. Moreover, we contrast these with older methods and implementations for POS tagging. We examine to what degree recent efforts in tagger development pay off in improved accuracy, and at what cost in terms of training and processing time. We also conduct in-domain vs. out-of-domain evaluations. Out-of-domain evaluations are particularly insightful because the distribution of the data a user tags will typically differ from the distribution on which the tagger was trained. Furthermore, we evaluate two lemmatization techniques. Finally, we compare pipeline tagging with a tagging approach that accounts for dependencies between inflectional categories.
Year: 2016
Venue: LREC 2016 - Tenth International Conference on Language Resources and Evaluation
Keywords: morphological tagging, lemmatization, morphologically rich languages
Field: Lemmatisation, Computer science, Natural language processing, Artificial intelligence, German
DocType: Conference
Citations: 6
PageRank: 0.52
References: 0
Authors: 3
Name, Order, Citations, PageRank
Steffen Eger, 1, 77, 25.00
Rüdiger Gleim, 2, 39, 6.27
Alexander Mehler, 3, 186, 36.63