Title
Towards the Machine Reading of Arabic Calligraphy: A Letters Dataset and Corresponding Corpus of Text
Abstract
Arabic calligraphy is one of the great art forms of the world. It displays Arabic phrases, commonly taken from the Holy Quran, in beautiful two-dimensional form. The use of two dimensions and the interweaving of letters and words make reading a far greater challenge for Artificial Intelligence (AI) than reading standard printed or handwritten Arabic. To approach this challenge, we have constructed a dataset of Arabic calligraphic letters, along with a corresponding corpus of phrases and quotes. The letters dataset contains a total of 3,467 images across 32 categories of Arabic calligraphic letters. The associated text corpus contains 544 unique quoted phrases. These data were collected from various open sources on the web and include examples from several Arabic calligraphic styles. We have also undertaken both an exploratory statistical analysis of these data and initial machine learning investigations. These analyses suggest that combining knowledge of a limited variety of Arabic calligraphy texts with a successful machine will be sufficient for the machine reading of forms of Arabic calligraphy.
Year
2018
DOI
10.1109/ASAR.2018.8480228
Venue
2018 IEEE 2nd International Workshop on Arabic and Derived Script Analysis and Recognition (ASAR)
Keywords
Arabic language, corpora, pattern recognition, Arabic dataset, calligraphy
DocType
Conference
ISBN
978-1-5386-1460-0
Citations
0
PageRank
0.34
References
0
Authors
2
Name                  Order  Citations  PageRank
Seetah A. L. Salamah  1      0          0.34
Ross D. King          2      1774       194.85