Title
A Human in the Loop Approach to Historical Handwritten Documents Transcription
Abstract
We propose a novel approach to assist the content transcription of handwritten digital documents. It adopts a segmentation-based keyword retrieval method that follows the query-by-string paradigm and exploits user validation of the retrieved words to improve its performance during operation. The approach starts with an initial training set, which contains only a few pages, and a tentative list of words expected to appear in the document, and iteratively interleaves a word retrieval step performed by the system with a validation step performed by the user. After each iteration, the system uses the validation results to update its internal model, so that this evidence informs further iterations of the search. Experimental results on the Bentham dataset show that the system can start with only a few word images and their transcripts, improves its performance during operation, and after a few iterations correctly transcribes more than 68% of the words in the list.
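The retrieve, validate, update cycle described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the nearest-exemplar matcher, the Euclidean distance, the threshold, and every name used here (`retrieve`, `human_in_the_loop`, `validate`) are assumptions made for illustration, and word images are stood in for by small feature tuples.

```python
from typing import Callable, Dict, List, Optional, Tuple

Feature = Tuple[float, ...]


def distance(a: Feature, b: Feature) -> float:
    """Euclidean distance between two feature vectors (an assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def retrieve(model: Dict[str, List[Feature]], feature: Feature,
             threshold: float) -> Optional[str]:
    """Return the keyword whose nearest exemplar lies within threshold, else None."""
    best_word, best_dist = None, threshold
    for word, exemplars in model.items():
        for exemplar in exemplars:
            d = distance(exemplar, feature)
            if d <= best_dist:
                best_word, best_dist = word, d
    return best_word


def human_in_the_loop(model: Dict[str, List[Feature]],
                      images: Dict[str, Feature],
                      validate: Callable[[str, str], bool],
                      threshold: float = 1.0,
                      max_iters: int = 10) -> Dict[str, str]:
    """Interleave system retrieval with user validation until nothing new is confirmed."""
    transcripts: Dict[str, str] = {}
    rejected = set()  # (image_id, word) pairs the user has already refused
    for _ in range(max_iters):
        confirmed: List[Tuple[str, str]] = []
        # Retrieval step: the system proposes a keyword for each open word image.
        for image_id, feature in images.items():
            if image_id in transcripts:
                continue
            guess = retrieve(model, feature, threshold)
            if guess is None or (image_id, guess) in rejected:
                continue
            # Validation step: the user accepts or refuses the proposal.
            if validate(image_id, guess):
                confirmed.append((image_id, guess))
            else:
                rejected.add((image_id, guess))
        if not confirmed:
            break
        # Update step: validated words become new exemplars for later iterations.
        for image_id, word in confirmed:
            transcripts[image_id] = word
            model[word].append(images[image_id])
    return transcripts
```

In this sketch the model is updated only after a full pass, mirroring the "after each iteration" wording of the abstract. A word image that is too far from every initial exemplar can still be reached in a later pass once nearby validated images have been added as exemplars.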
Year
2016
DOI
10.1109/ICFHR.2016.0051
Venue
2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR)
Keywords
Historical handwritten documents, human in the loop, word retrieval
Field
Pattern recognition, Segmentation, Computer science, Knowledge-based systems, Handwriting recognition, Image segmentation, Exploit, Artificial intelligence, Human-in-the-loop, Hidden Markov model, Machine learning, Internal model
DocType
Conference
ISSN
2167-6445
ISBN
978-1-5090-0982-4
Citations
0
PageRank
0.34
References
5
Authors
3
Name              Order  Citations  PageRank
Adolfo Santoro    1      7          2.72
Antonio Parziale  2      25         5.66
Angelo Marcelli   3      139        32.42