Title
An Adaptive Segmentation Technique For the Ancient Ethiopian Ge’ez Language Digital Manuscripts
Abstract
The study of the ancient Ethiopian Ge’ez language is essential to understanding the history of Ethiopia and the evolution and modern usage of the Roman alphabet. By the 10th century AD, ancient Ge’ez had ceased to exist as a spoken language in Ethiopia. Spoken Ge’ez split into many closely related tongues, mainly Tigrinya in the north and Amharic in the south. However, written Ge’ez was kept firmly in use, purely for sacred and scholarly endeavours, from the 13th to the 17th centuries, a span known as the “classical period” of Ethiopian literature. Ancient documents have great benefits for the modern world beyond cultural heritage. In recent years, the digital archiving of ancient documents has expanded greatly across the globe, and ancient Ethiopian manuscripts written in Ge’ez are beginning to appear in digital libraries and on the web. However, most of these documents are stored as raw images, which are not suitable for document processing and indexing. This limits the use of Ge’ez documents in many research fields. Hence, there is a need for a recognition model that converts raw images into a machine-encoded format. In the process of developing such a model, each document must be segmented into individual characters. The noise in old documents usually reduces the performance of many segmentation algorithms. This paper presents an image segmentation technique for old Ge’ez handwritten documents. The technique outperforms the widely used watershed algorithm by 18.6% in segmentation accuracy. It will form part of an overall system for automatic optical character recognition of ancient Ge’ez handwritten documents.
Year: 2018
DOI: 10.1109/CEEC.2018.8674218
Venue: 2018 10th Computer Science and Electronic Engineering (CEEC)
Keywords: Image segmentation, Approximation algorithms, Image edge detection, Libraries, Optical character recognition software, Character recognition, History
DocType: Conference
ISBN: 978-1-5386-7275-4
Citations: 0
PageRank: 0.34
References: 0
Authors: 2
Name                  Order  Citations  PageRank
Daniel Mahetot Kassa  1      0          0.34
Hani Hagras           2      1747       129.26