Title
Design and Development of Part-of-Speech-Tagging Resources for Wolof (Niger-Congo, spoken in Senegal)
Abstract
In this paper, we report on the design of a part-of-speech-tagset for Wolof and on the creation of a semi-automatically annotated gold standard. The main motivation for this resource is to obtain data for training automatic taggers with machine learning approaches. Hence, we take machine learning considerations into account during tagset design and present training experiments as part of this paper. The best automatic tagger achieves an accuracy of 95.2% in cross-validation experiments. We also wanted to create a basis for experimenting with annotation projection techniques, which exploit parallel corpora. For this reason, it was useful to use a part of the Bible as the gold standard corpus, for which sentence-aligned parallel versions in many languages are easy to obtain.
Year: 2010
Venue: LREC
Keywords: machine learning, cross validation, part of speech, gold standard
DocType: Conference
Citations: 0
PageRank: 0.34
References: 1
Authors: 3
Name                     Order  Citations  PageRank
Cheikh M. Bamba Dione    1      5          1.90
Jonas Kuhn               2      34         8.90
Sina Zarrieß             3      35         8.65