Title
Reducing the Human Effort in Text Line Segmentation for Historical Documents
Abstract
Labeling the layout of historical documents to prepare training data for machine learning techniques is an arduous task that requires great human effort. A draft of the layout can be obtained with a document layout analysis (DLA) system and then corrected by the user with less effort than labeling from scratch. In this paper we study an iterative process in which the user supervises and corrects the draft only for the pages automatically selected by the DLA system, with the aim of reducing the required human effort. The results show that similar DLA quality can be achieved while reducing the number of pages that the user has to annotate, and that the accumulated human effort required to obtain the layout of the pages used to train the models can be reduced by more than 95%.
Year
2021
DOI
10.1007/978-3-030-86334-0_34
Venue
DOCUMENT ANALYSIS AND RECOGNITION, ICDAR 2021, PT III
Keywords
Document layout analysis, Text line segmentation, Human effort reduction, Historical document
DocType
Conference
Volume
12823
ISSN
0302-9743
Citations
0
PageRank
0.34
References
0
Authors
4
Name                  Order  Citations  PageRank
Emilio Granell        1      42         6.80
Lorenzo Quirós        2      0          0.34
Verónica Romero       3      259        28.31
Joan-Andreu Sánchez   4      198        29.00