An Adaptive Text-Line Extraction Algorithm For Printed Arabic Documents With Diacritics - Citegraph

Paper Info

Title
An Adaptive Text-Line Extraction Algorithm For Printed Arabic Documents With Diacritics

Abstract
The performance of document text recognition depends on text-line segmentation, which relies heavily on the type of language, the author's writing style, the pen type, and the document quality. In this paper, we present a novel unsupervised text-line segmentation algorithm for printed Arabic documents with and without diacritics. The presented approach employs a projection profile along with connected components in an iterative manner to detect text lines. The primary benefits of the presented algorithm are that (i) it is not threshold dependent, (ii) it does not require a training phase for threshold selection, and (iii) it is robust to page rotation and to variation in font type, size, and style, for documents both with and without diacritics. Extensive computational simulations on a manually collected dataset demonstrate the efficiency of the proposed scheme compared with several baseline and state-of-the-art methods, including the Voronoi, X-Y Cut, Docstrum, Smearing, and Seam-carving methods. A computational time analysis is also presented.
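To make the term "projection profile" in the abstract concrete, the following is a minimal sketch of the textbook horizontal projection profile split. It is not the paper's algorithm: the iterative combination with connected components is not reproduced here. The function names `horizontal_profile` and `split_lines` and the toy `page` are hypothetical, and the page is assumed to be already binarized, with 1 marking an ink pixel.

```python
from typing import List, Tuple

def horizontal_profile(image: List[List[int]]) -> List[int]:
    """Count ink pixels (value 1) in each row of a binarized page."""
    return [sum(row) for row in image]

def split_lines(image: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (top_row, bottom_row) for each run of rows that contain ink.

    Assumption: text lines are separated by at least one fully blank row.
    """
    profile = horizontal_profile(image)
    lines: List[Tuple[int, int]] = []
    start = None
    for y, ink in enumerate(profile):
        if ink > 0 and start is None:
            start = y                      # a band of ink begins
        elif ink == 0 and start is not None:
            lines.append((start, y - 1))   # the band ended on the previous row
            start = None
    if start is not None:                  # band runs to the bottom edge
        lines.append((start, len(profile) - 1))
    return lines

# Hypothetical toy page: two bands of ink separated by a blank row.
page = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
print(split_lines(page))
```

A plain split like this treats every band of ink as its own line, so a row of diacritic marks separated from its base text by a blank row would be reported as a separate line. Handling such cases needs extra information beyond the profile alone.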

Year	2021
DOI	10.1007/s11042-020-09737-1
Venue	MULTIMEDIA TOOLS AND APPLICATIONS
Keywords	Arabic character recognition, Line segmentation, Baseline, Diacritics
DocType	Journal
Volume	80
Issue	2
ISSN	1380-7501
Citations	0
PageRank	0.34
References	0
Authors	5

Authors (5 rows)

Name	Order	Citations	PageRank
Khader Mohammad	1	13	5.22
Aziz Qaroush	2	10	6.86
Mahdi Washha	3	0	0.34
Sos S. Agaian	4	744	83.01
Iyad Tumar	5	22	4.77