Paper Info

Title
Simultaneous tokenization and part-of-speech tagging for Arabic without a morphological analyzer

Abstract
We describe an approach to simultaneous tokenization and part-of-speech tagging that is based on separating the closed-class and open-class items, and focusing on the likelihood of the possible stems of the open-class words. By encoding some basic linguistic information, the machine learning task is simplified, while achieving state-of-the-art tokenization results and competitive POS results, although with a reduced tag set and some evaluation difficulties.

Year	Venue	Keywords
2010	ACL (Short Papers)	reduced tag set, openclass word, evaluation difficulty, open-class item, simultaneous tokenization, state-of-the-art tokenization result, morphological analyzer, part-of-speech tagging, basic linguistic information, competitive pos result
Field	DocType	Volume
Rule-based machine translation, Tokenization (data security), Arabic, Lexical analysis, Computer science, Part-of-speech tagging, Speech recognition, Natural language processing, Artificial intelligence, Spectrum analyzer, Encoding (memory)	Conference	P10-2
Citations	PageRank	References
6	0.70	2
Authors
1

Authors (1 row)

Name	Order	Citations	PageRank
Seth Kulick	1	221	29.66