Title | ||
---|---|---|
Combining Classifiers to Improve Part of Speech Tagging: A Case Study for Brazilian Portuguese |
Abstract | ||
---|---|---|
Four taggers have been trained on a 100,000-word corpus of Brazilian Portuguese, namely Unigram (Treetagger), N-gram (Treetagger), transformation-based (TBL) and Maximum-Entropy tagging (MXPOST). The latter displayed the best accuracy (88.73%), which is still much lower than the state-of-the-art accuracy for English. The low accuracy is attributed to the reduced size of the training corpus. Twelve methods of combination were used, four of which led to an improvement over the MXPOST accuracy. The best result (89.42%) was obtained with a majority-wins voting strategy. |
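
The majority-wins voting mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `majority_vote`, the tag labels, and the tie-breaking rule (prefer the tagger listed first, e.g. the most accurate one) are all assumptions made for illustration, since the abstract does not say how ties were resolved.

```python
from collections import Counter


def majority_vote(tags_per_tagger):
    """Combine the outputs of several taggers by per-token majority vote.

    tags_per_tagger: list of tag sequences, one per tagger, all of the
    same length. Taggers are assumed to be listed in priority order;
    on a tie, the tag of the earliest tagger among the tied tags wins
    (an assumption, not taken from the paper).
    """
    if not tags_per_tagger:
        return []
    length = len(tags_per_tagger[0])
    if any(len(seq) != length for seq in tags_per_tagger):
        raise ValueError("all taggers must tag the same number of tokens")

    combined = []
    for votes in zip(*tags_per_tagger):  # one tuple of votes per token
        counts = Counter(votes)
        top = max(counts.values())
        # First tag, in tagger priority order, that reaches the top count.
        combined.append(next(tag for tag in votes if counts[tag] == top))
    return combined


# Hypothetical outputs for a three-token sentence (tag names are made up).
mxpost = ["ART", "N", "V"]
tbl = ["ART", "N", "N"]
ngram = ["ART", "ADJ", "V"]
unigram = ["PRON", "ADJ", "N"]

print(majority_vote([mxpost, tbl, ngram, unigram]))
```

Listing the strongest single tagger first means the combination falls back to its choice whenever the vote is split evenly, which is one plausible way to handle an even number of voters.
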
Year | Venue | Keywords |
---|---|---|
2000 | IBERAMIA-SBIA 2000 Open Discussion Track | improve part,combining classifiers,brazilian portuguese,case study,speech tagging,maximum entropy |
Field | DocType | ISBN |
---|---|---|
Computer science, Part-of-speech tagging, Natural language processing, Artificial intelligence, Brazilian Portuguese | Conference | 85-87837-03-6 |
Citations | PageRank | References |
---|---|---|
13 | 1.53 | 6 |
Authors | ||
---|---|---|
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Rachel Virgínia Xavier Aires | 1 | 13 | 1.53 |
Sandra M. Aluísio | 2 | 226 | 28.73 |
Denise Campos e Silva Kuhn | 3 | 13 | 1.53 |
Marcio Luis Barsi Andreeta | 4 | 13 | 1.53 |
Osvaldo N. Oliveira Jr. | 5 | 247 | 17.25 |