Title
Transcription Alignment for Highly Fragmentary Historical Manuscripts: The Dead Sea Scrolls
Abstract
Most of the Dead Sea Scrolls have now been digitally transcribed and imaged to very high standards. Our goal is to align the transcriptions with the text visible in the image, glyph by (often fragmentary) glyph. This involves several tasks, normally considered in isolation: (A) baseline segmentation; (B) line polygon extraction; (C) automated transcription by handwritten character recognition, to aid in alignment; and (D) alignment of the Unicode characters in a line transcription with the characters in the image of that line. The task is frustrated by the degraded nature of the frequently very small and/or warped fragments with many broken letters, substantially different allographs, ligatures, and scribal idiosyncrasies. Furthermore, a great number of inconsistencies between current cataloguing systems for the data need to be resolved. For each task, we apply state-of-the-art machine-learning methods in addition to more traditional techniques, each presenting significant difficulties on account of the poor state of most fragments' preservation. We have built ground-truth datasets and have managed to achieve good results with well-preserved fragments by leveraging heavily augmented transfer learning from prior work with medieval manuscripts.
Year
2020
DOI
10.1109/ICFHR2020.2020.00072
Venue
2020 17th International Conference on Frontiers in Handwriting Recognition (ICFHR)
Keywords
historical manuscripts, transcription alignment, image segmentation
DocType
Conference
ISSN
2167-6445
ISBN
978-1-7281-9967-2
Citations
0
PageRank
0.34
References
10
Authors
5