Title
Authorship Attribution of Arabic Articles.
Abstract
With the huge size and diversity of web content and the emergence of more social media platforms and blog websites, more people are contributing content of varying quality. Many users prefer to remain anonymous when posting material to the web, so more texts (articles, blogs, essays and emails) are being published under assumed identities or with no known author. This may raise copyright and other legal issues, hence the need for good authorship attribution systems. The problem may be more acute for Arabic texts due to restrictions, actual and perceived, on electronic content publication and the prevailing social norms. In this paper we study Arabic author attribution (AAA): identifying the author of an Arabic (Modern Standard Arabic, MSA) article from among a given set of potential authors. Many features were considered for training and testing our AAA models. We studied the effects of features such as part of speech (PoS) tags, stylistic features like punctuation usage and sentence characteristics, word types and word diversity. In general, PoS features, word n-gram features and rare words proved to be the most informative for our task. We also investigated the effect of factors such as the number of potential authors, the number of articles per author, and the size of the text chunks used, and we report on the results.
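As a rough illustration of the kinds of surface features the abstract names (punctuation usage, sentence characteristics, word diversity, rare words and word n-grams), the following is a minimal sketch using only the Python standard library. It is not the authors' implementation: the function names, the punctuation set, the tokenizer and the use of hapax legomena as a stand-in for "rare words" are all assumptions made for this example, and PoS tagging is omitted because it needs an external Arabic tagger.

```python
import re
from collections import Counter

# Illustrative punctuation set (an assumption): Latin marks plus the
# Arabic comma, semicolon and question mark.
PUNCTUATION = ".,;:!?،؛؟"


def tokenize(text):
    # Runs of word characters; in Python 3, \w also matches Arabic letters.
    # Assumes undiacritized text, since combining marks would split words.
    return re.findall(r"\w+", text)


def stylometric_features(text, n=2):
    """Return a dict of simple style features for one text chunk."""
    tokens = tokenize(text)
    total = len(tokens)
    counts = Counter(tokens)

    # Sentences are approximated by splitting on end-of-sentence marks.
    sentences = [s for s in re.split(r"[.!?؟]+", text) if s.strip()]

    # Word n-grams as tuples of n consecutive tokens.
    ngrams = Counter(zip(*(tokens[i:] for i in range(n))))

    # Words occurring exactly once, used here as a proxy for rare words.
    hapax = sum(1 for c in counts.values() if c == 1)

    return {
        "type_token_ratio": len(counts) / total if total else 0.0,
        "hapax_ratio": hapax / total if total else 0.0,
        "avg_sentence_length": total / len(sentences) if sentences else 0.0,
        "punctuation_per_token": {
            p: (text.count(p) / total if total else 0.0) for p in PUNCTUATION
        },
        "word_ngrams": ngrams,
    }
```

Feature dictionaries like this one could be computed per text chunk and passed to any standard classifier. Which classifier and which feature weighting the paper uses is not stated in the abstract.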
Year: 2019
DOI: 10.1007/978-3-030-32959-4_14
Venue: Communications in Computer and Information Science
Keywords: Arabic authorship attribution, Arabic plagiarism detection, Writing style recognition, Arabic special features, Arabic text author identification
DocType: Conference
Volume: 1108
ISSN: 1865-0929
Citations: 0
PageRank: 0.34
References: 0
Authors: 3
Name            Order   Citations   PageRank
Maha Hajja      1       0           0.34
Ahmad Yahya     2       0           0.34
Adnan Yahya     3       68          4.77