Title
Grammar-based tools for the creation of tagging resources for an unresourced language: the case of Northern Sotho
Abstract
We describe an architecture for the parallel construction of a tagger lexicon and an annotated reference corpus for the part-of-speech tagging of Northern Sotho, a Bantu language of South Africa, for which no tagged resources have been available so far. Our tools make use of grammatical properties (morphological and syntactic) of the language. We use symbolic pretagging, followed by stochastic tagging, an architecture which proves useful not only for the bootstrapping of tagging resources, but also for the tagging of any new text. We discuss the tagset design, the tool architecture and the current state of our ongoing effort.
Year: 2006
Venue: LREC
Field: Architecture, Bantu languages, Bootstrapping, Computer science, Grammar, Lexicon, Artificial intelligence, Natural language processing, Linguistics, Syntax
DocType: Conference
Citations: 0
PageRank: 0.34
References: 0
Authors: 2
Name, Order, Citations, PageRank
Ulrich Heid, 1, 190, 40.48
Danie J. Prinsloo, 2, 0, 0.34