Abstract | ||
---|---|---|
Previous work on part-of-speech (PoS) tagging Icelandic has shown that the morphological complexity of the language poses considerable difficulties for PoS taggers. In this paper, we increase the tagging accuracy of Icelandic text by using two methods. First, we present a new tagger, by integrating an HMM tagger into a linguistic rule-based tagger. Our tagger obtains state-of-the-art tagging accuracy of 92.31% using the standard test set derived from the IFD corpus, and 92.51% using a corrected version of the corpus. Second, we design an external tagset, by removing information from the internal tagset which reflects distinctions that are not morphologically based. Using the external tagset for evaluation, the tagging accuracy further increases to 93.63%. |
Year | Venue | Keywords |
---|---|---|
2009 | NODALIDA | rule based, part of speech |
Field | DocType | Citations |
Computer science,Speech recognition,Natural language processing,Artificial intelligence,Hidden Markov model,Test set,Icelandic | Conference | 6 |
PageRank | References | Authors |
0.59 | 15 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Hrafn Loftsson | 1 | 112 | 12.57 |
Ida Kramarczyk | 2 | 6 | 0.59 |
Sigrún Helgadóttir | 3 | 71 | 7.61 |
Eiríkur Rögnvaldsson | 4 | 94 | 12.12 |