Title
Combining Classifiers to Improve Part of Speech Tagging: A Case Study for Brazilian Portuguese
Abstract
Four taggers were trained on a 100,000-word corpus of Brazilian Portuguese: Unigram (Treetagger), N-gram (Treetagger), transformation-based (TBL), and Maximum-Entropy (MXPOST). MXPOST achieved the best accuracy (88.73%), which is still much lower than the state-of-the-art accuracy for English. The low accuracy is attributed to the small size of the training corpus. Twelve combination methods were tested, four of which improved on the MXPOST accuracy. The best result (89.42%) was obtained with a majority-wins voting strategy.
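The majority-wins voting mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the example tags, and the tie-breaking rule (prefer the tag of one designated tagger, for instance the most accurate one) are assumptions made for illustration, since the abstract does not say how ties were resolved.

```python
from collections import Counter


def majority_vote(tags, fallback_index=0):
    """Pick the tag proposed by the most taggers for a single token.

    tags: one proposed tag per tagger, in a fixed tagger order.
    fallback_index: hypothetical tie-breaker, the position of the tagger
    whose tag is preferred when several tags share the top vote count.
    """
    counts = Counter(tags)
    top = max(counts.values())
    winners = {tag for tag, count in counts.items() if count == top}
    # Prefer the designated tagger's tag if it is among the top-voted tags.
    if tags[fallback_index] in winners:
        return tags[fallback_index]
    # Otherwise return the first top-voted tag in tagger order.
    return next(tag for tag in tags if tag in winners)


def combine_taggings(taggings, fallback_index=0):
    """Combine per-tagger tag sequences for one sentence, token by token.

    taggings: list of tag sequences, one per tagger, all of equal length.
    """
    return [majority_vote(list(votes), fallback_index) for votes in zip(*taggings)]


# Illustrative tags only; they are not taken from the paper's corpus or tagset.
taggings = [
    ["ART", "N", "V"],    # tagger 0
    ["ART", "ADJ", "V"],  # tagger 1
    ["ART", "N", "N"],    # tagger 2
    ["PRON", "N", "N"],   # tagger 3
]
combined = combine_taggings(taggings, fallback_index=3)
```

With four taggers, a 2-2 split is possible, which is why some tie-breaking rule is needed. Here the third token is tied between V and N, and the sketch resolves it in favor of the tagger at `fallback_index`.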
Year
2000
Venue
IBERAMIA-SBIA 2000 Open Discussion Track
Keywords
improve part, combining classifiers, brazilian portuguese, case study, speech tagging, maximum entropy
Field
Computer science, Part-of-speech tagging, Natural language processing, Artificial intelligence, Brazilian Portuguese
DocType
Conference
ISBN
85-87837-03-6
Citations
13
PageRank
1.53
References
6
Authors
5