Error Analysis In Croatian Morphosyntactic Tagging - Citegraph

Paper Info

Title
Error Analysis In Croatian Morphosyntactic Tagging

Abstract
In this paper, we provide detailed insight into the properties of errors generated by a stochastic morphosyntactic tagger that assigns Multext-East morphosyntactic descriptions to Croatian texts. Tagging the Croatia Weekly newspaper corpus with the CroTag tagger in stochastic mode revealed that approximately 85 percent of all tagging errors occur on nouns, adjectives, pronouns and verbs. Moreover, approximately 50 percent of these are shown to be incorrect assignments of case values. We provide various other distributional properties of errors in assigning morphosyntactic descriptions for these and other parts of speech. On the basis of these properties, we propose rule-based and stochastic strategies that could be integrated into the tagging module, creating a hybrid procedure to raise overall tagging accuracy for Croatian.
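
The kind of error distribution described in the abstract can be illustrated with a small tally over gold and predicted tags. The following is a minimal sketch, not the authors' procedure: the function name `error_profile`, the `CASE_INDEX` table and the toy tag sequences are all hypothetical, and the assumed position of the case attribute in noun and adjective descriptions (as in `Ncmsn` or `Agpmsny`) should be checked against the Multext-East specification in use.

```python
from collections import Counter

# Assumption: 0-based position of the case attribute in a Multext-East
# morphosyntactic description, e.g. "Ncmsn" (noun) and "Agpmsny" (adjective).
# Hypothetical table for illustration; verify against the actual tagset.
CASE_INDEX = {"N": 4, "A": 5}


def error_profile(gold, predicted):
    """Tally tagging errors by part of speech and count wrong case values.

    Returns (errors_by_pos, shares, case_errors), where shares maps each
    part-of-speech letter to its fraction of all errors.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    errors_by_pos = Counter()
    case_errors = 0
    for g, p in zip(gold, predicted):
        if g == p:
            continue  # correctly tagged token
        pos = g[0]  # first character of the description is the part of speech
        errors_by_pos[pos] += 1
        idx = CASE_INDEX.get(pos)
        # Count a case error only when the part of speech itself was right
        # and both descriptions are long enough to carry a case value.
        if (
            idx is not None
            and p[:1] == pos
            and len(g) > idx
            and len(p) > idx
            and g[idx] != p[idx]
        ):
            case_errors += 1

    total = sum(errors_by_pos.values())
    shares = {pos: n / total for pos, n in errors_by_pos.items()} if total else {}
    return errors_by_pos, shares, case_errors


# Toy data, invented for illustration only.
gold = ["Ncmsn", "Agpmsny", "Vmip3s", "Ncfsa", "Sl"]
pred = ["Ncmsa", "Agpmsny", "Vmip3p", "Ncfsa", "Sa"]
counts, shares, case_errors = error_profile(gold, pred)
print(dict(counts), case_errors)
```

On a real corpus, the same tally would be run over the tagger output and the manually annotated reference, and the shares compared across parts of speech.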

Year	DOI	Venue
2009	10.1109/ITI.2009.5196140	Proceedings of the ITI 2009 31st International Conference on Information Technology Interfaces
Keywords	Field	DocType
Morphosyntactic tagging, part-of-speech tagging, error analysis, error distribution, Croatian language, hybrid tagging	Computer science, Noun, Part-of-speech tagging, Knowledge-based systems, Stochastic process, Speech recognition, Part of speech, Natural language, Natural language processing, Artificial intelligence, Croatian, Hidden Markov model	Conference
ISSN	Citations	PageRank
1330-1012	1	0.38
References	Authors
1	3

Authors (3 rows)

Name	Order	Citations	PageRank
Željko Agić	1	159	20.44
Marko Tadić	2	80	15.61
Zdravko Dovedan	3	19	2.80
