Title
Part-of-Speech Tagging Using Word Probability Based on Category Patterns
Abstract
This paper focuses on part-of-speech (POS, category) tagging based on word probability estimated using morpheme unigrams and category patterns within a word. The word-N-gram-based POS-tagging model is difficult to adapt to agglutinative languages such as Korean, Turkish, and Hungarian because of their high word productivity. Thus, many stochastic studies on Korean POS tagging have been based on morpheme N-grams. However, the morpheme-N-gram model also struggles with data sparseness when contextual information is added to ensure sufficient performance. In addition, the model has difficulty capturing the relationships among morphemes within a word. The present POS-tagging algorithm (a) resolves the data-sparseness problem by taking a morpheme-unigram-based approach and (b) accounts for the relationships among morphemes within a word by estimating the weight of each morpheme's category in the category pattern that constitutes the word. The proposed model achieved performance similar to that of other models that use more than morpheme unigrams.
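The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes a candidate analysis of a word is scored as the product of its morpheme unigram probabilities, multiplied by a single weight for the word's category pattern estimated by relative frequency (a simplification of the per-category weights the abstract mentions). The corpus, the romanized morphemes, the tags, and the names `morph_prob`, `pattern_weight`, `word_score`, and `best_analysis` are all hypothetical.

```python
from collections import Counter
from math import prod

# Toy hand-tagged corpus: one list of (morpheme, category) pairs per word.
# Morphemes, tags, and counts are invented for illustration only.
CORPUS = (
    [[("na", "NP"), ("neun", "JX")]] * 3
    + [[("hakgyo", "NNG"), ("neun", "JX")]] * 2
    + [[("nal", "VV"), ("neun", "ETM")]]
    + [[("ga", "VV"), ("neun", "ETM")]]
)

# Unigram counts over (morpheme, category) pairs.
morph_counts = Counter(pair for word in CORPUS for pair in word)
# Counts of category patterns, i.e. the category sequence inside each word.
pattern_counts = Counter(tuple(cat for _, cat in word) for word in CORPUS)
total_morphs = sum(morph_counts.values())
total_words = len(CORPUS)


def morph_prob(morpheme, category):
    """Relative-frequency unigram probability of a tagged morpheme."""
    return morph_counts[(morpheme, category)] / total_morphs


def pattern_weight(pattern):
    """Assumed weight: relative frequency of the category pattern."""
    return pattern_counts[tuple(pattern)] / total_words


def word_score(analysis):
    """Score one candidate analysis of a word (no smoothing in this sketch)."""
    pattern = [cat for _, cat in analysis]
    return pattern_weight(pattern) * prod(morph_prob(m, c) for m, c in analysis)


def best_analysis(candidates):
    """Pick the highest-scoring analysis among the candidates for one word."""
    return max(candidates, key=word_score)


# Two competing analyses of the same surface word.
candidates = [
    [("na", "NP"), ("neun", "JX")],
    [("nal", "VV"), ("neun", "ETM")],
]
best = best_analysis(candidates)
```

Because every quantity is a unigram or a within-word pattern count, no cross-word context is needed, which is the property the abstract credits with easing data sparseness. Unseen morphemes or patterns score zero here; a practical version would need smoothing.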
Year
2007
DOI
10.1007/978-3-540-70939-8_11
Venue
CICLing '07 Proceedings of the 8th International Conference on Computational Linguistics and Intelligent Text Processing
Keywords
morpheme unigrams, present pos-tagging algorithm, word probability, word-n-gram-based pos-tagging model, morpheme n-grams, korean pos-tagging, morpheme-unigram model, morpheme-n-gram model, category pattern, category patterns, part-of-speech tagging, part of speech
Field
Morpheme, Difficulty coping, Contextual information, Turkish, Isolating language, Computer science, Agglutinative language, Part-of-speech tagging, Speech recognition, Natural language processing, Artificial intelligence
DocType
Conference
Volume
4394
ISSN
0302-9743
Citations
0
PageRank
0.34
References
8
Authors
4
Name              Order  Citations  PageRank
Mi-Young Kang     1      40         11.87
Sungwon Jung      2      320        59.65
Kyung-Soon Park   3      0          0.34
Hyuk-Chul Kwon    4      136        29.02