Title
Combination Of Deep Neural Networks And Logical Rules For Record Segmentation In Historical Handwritten Registers Using Few Examples
Abstract
This work focuses on the layout analysis of historical handwritten registers in which local religious ceremonies were recorded. The aim is to delimit each record using little training data. To this end, two approaches are proposed. First, three state-of-the-art object detection networks are explored and compared. Further experiments are then conducted on Mask R-CNN, as it yields the best performance. Second, we introduce and investigate Deep&Syntax, a hybrid system that takes advantage of recurrent patterns to delimit each record by combining U-shaped networks and logical rules. Finally, these two approaches are evaluated on 3708 French records (sixteenth to eighteenth centuries), as well as on the Esposalles public database, which contains 253 Spanish records (seventeenth century). While both systems perform well on homogeneous documents, we observe a significant drop in performance with Mask R-CNN on more challenging documents, especially when it is trained on a small, non-representative subset. By contrast, Deep&Syntax relies on steady patterns and is therefore able to process a wider range of documents with less training data. When both systems are trained on 120 documents, Deep&Syntax produces 15% more match configurations and reduces the ZoneMap surface error metric by 30%. It also outperforms Mask R-CNN when trained on a database three times smaller. As Deep&Syntax generalizes better, we believe it can be used for massive parish register processing, since collecting and annotating a sufficiently large and representative set of training data is not always achievable.
Year: 2021
DOI: 10.1007/s10032-021-00362-8
Venue: INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION
Keywords: Historical handwritten documents, Deep neural networks, Hybrid systems, Layout analysis
DocType: Journal
Volume: 24
Issue: 1-2
ISSN: 1433-2833
Citations: 0
PageRank: 0.34
References: 0
Authors: 4
Name | Order | Citations | PageRank
Solène Tarride | 1 | 0 | 0.34
Aurélie Lemaitre | 2 | 63 | 9.41
Bertrand Coüasnon | 3 | 169 | 19.22
Sophie Tardivel | 4 | 0 | 0.68