Title
Improving the PoS tagging accuracy of Icelandic text
Abstract
Previous work on part-of-speech (PoS) tagging Icelandic has shown that the morphological complexity of the language poses considerable difficulties for PoS taggers. In this paper, we increase the tagging accuracy of Icelandic text by using two methods. First, we present a new tagger, by integrating an HMM tagger into a linguistic rule-based tagger. Our tagger obtains state-of-the-art tagging accuracy of 92.31% using the standard test set derived from the IFD corpus, and 92.51% using a corrected version of the corpus. Second, we design an external tagset, by removing information from the internal tagset which reflects distinctions that are not morphologically based. Using the external tagset for evaluation, the tagging accuracy further increases to 93.63%.
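The second method in the abstract, scoring against an external tagset that drops some distinctions of the internal one, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the tag strings, the `collapse` mapping, and the `accuracy` helper are hypothetical and are not the paper's tagset or evaluation code.

```python
from typing import Callable, Sequence


def accuracy(
    gold: Sequence[str],
    predicted: Sequence[str],
    to_external: Callable[[str], str] = lambda tag: tag,
) -> float:
    """Fraction of tokens whose mapped predicted tag equals the mapped gold tag."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot score an empty sequence")
    correct = sum(to_external(g) == to_external(p) for g, p in zip(gold, predicted))
    return correct / len(gold)


def collapse(tag: str) -> str:
    """Hypothetical internal-to-external mapping: drop everything after the first '-'."""
    return tag.split("-")[0]


# Toy tags, invented for this example (not the IFD tagset).
gold = ["N-a", "V", "N-b", "ADJ"]
pred = ["N-b", "V", "N-b", "ADV"]

internal_acc = accuracy(gold, pred)            # scored on the full internal tags
external_acc = accuracy(gold, pred, collapse)  # scored after collapsing distinctions
print(internal_acc, external_acc)
```

In this toy data the first token is wrong under the internal tags but correct once the suffix is dropped, which is why accuracy measured on a coarser external tagset can only stay the same or rise.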
Year
2009
Venue
NODALIDA
Keywords
rule based, part of speech
Field
Computer science, Speech recognition, Natural language processing, Artificial intelligence, Hidden Markov model, Test set, Icelandic
DocType
Conference
Citations
6
PageRank
0.59
References
15
Authors
4
Name                  Order  Citations  PageRank
Hrafn Loftsson        1      112        12.57
Ida Kramarczyk        2      6          0.59
Sigrún Helgadóttir    3      71         7.61
Eiríkur Rögnvaldsson  4      94         12.12