Title
Sentence Boundary Detection in Colloquial Arabic Text: A Preliminary Result
Abstract
Natural language processing tasks are increasingly conducted over online content. This poses a special problem for applications targeting the Arabic language. Online Arabic content is usually written in informal colloquial Arabic, which is characteristically ill-structured and lacks linguistic standardization. In this paper, we investigate sentence boundary detection, a preliminary step toward successful NLP. Because informal Arabic lacks basic linguistic rules, we establish a list of commonly used punctuation marks after extensively studying a large amount of informal Arabic text. Moreover, we evaluate the correct usage of these punctuation marks as sentence delimiters; the evaluation yielded a preliminary accuracy of 70%.
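The idea of treating punctuation marks as sentence delimiters and then measuring how often they mark a true boundary can be illustrated with a minimal sketch. The delimiter set `DELIMITERS`, the function names, and the accuracy definition below are assumptions made for illustration; the abstract does not give the paper's actual punctuation list or evaluation procedure.

```python
import re

# Hypothetical delimiter set for illustration only; the paper's
# actual punctuation list is not given in the abstract.
DELIMITERS = ".!?؟"


def _delimiter_pattern(delimiters):
    """Build a regex matching one or more consecutive delimiter characters."""
    return "[" + re.escape(delimiters) + "]+"


def split_sentences(text, delimiters=DELIMITERS):
    """Split text after each run of delimiter characters."""
    sentences = []
    start = 0
    for match in re.finditer(_delimiter_pattern(delimiters), text):
        chunk = text[start:match.end()].strip()
        if chunk:
            sentences.append(chunk)
        start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


def boundary_offsets(text, delimiters=DELIMITERS):
    """Return the end offset of every delimiter run in text."""
    return [m.end() for m in re.finditer(_delimiter_pattern(delimiters), text)]


def delimiter_accuracy(predicted, gold):
    """Share of predicted boundaries that an annotator also marked (assumed metric)."""
    if not predicted:
        return 0.0
    gold_set = set(gold)
    return sum(1 for p in predicted if p in gold_set) / len(predicted)


if __name__ == "__main__":
    sample = "كيف حالك؟ أنا بخير. شكرا"
    for sentence in split_sentences(sample):
        print(sentence)
```

Runs of repeated marks (such as `...` or `!!`) are treated as a single boundary here, a design choice that seems reasonable for informal text but is not taken from the paper.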
Year: 2011
DOI: 10.1109/IALP.2011.38
Venue: IALP
Keywords: nlp, preliminary result, informal arabic text, colloquial arabic, online arabic content, online content, natural language processing, colloquial arabic text, natural language processing task, punctuation marks, text detection, sentence boundary detection, linguistic rules, arabic language, punctuation mark, linguistics, preliminary accuracy, sentence delimiters, informal colloquial, basic linguistic rule, informal arabic, text analysis, informal colloquial arabic text
Field: Sentence boundary disambiguation, Arabic, Computer science, Boundary detection, Artificial intelligence, Natural language processing, Delimiter, Sentence, Standardization, Linguistics, Punctuation, Modern Arabic mathematical notation
DocType: Conference
ISBN: 978-1-4577-1733-8
Citations: 3
PageRank: 0.39
References: 2
Authors: 3
Name, Order, Citations, PageRank
Afnan A. Al-Subaihin, 1, 76, 6.02
Hend S. Al-Khalifa, 2, 276, 51.73
AbdulMalik S. Al-Salman, 3, 141, 18.35