Title
A Multimodal Crowdsourcing Framework for Transcribing Historical Handwritten Documents.
Abstract
Transcription of handwritten historical documents is a major topic in document analysis, largely for cultural reasons. State-of-the-art handwritten text recognition systems can speed up the transcription task. Currently, this automatic transcription is far from perfect, and revision by a human expert is still required to obtain the correct transcription. In this context, crowdsourcing has emerged as a powerful tool for large-scale transcription at relatively low cost, since it can dramatically reduce the supervision effort required of professional transcribers. However, current transcription crowdsourcing platforms are mostly limited to non-mobile devices, because keyboards on mobile devices are not user-friendly enough for most users. This work presents the alternative of using speech dictation of handwritten text lines as the transcription source in a crowdsourcing platform. The experiments explore how an initial handwritten text recognition hypothesis can be improved with speech recognition contributions from several speakers, yielding a better final hypothesis that a professional transcriber can correct with less effort.
Year: 2016
DOI: 10.1145/2960811.2960815
Venue: DocEng
Field: Transcription (linguistics), Document analysis, Crowdsourcing, Computer science, Transcription (software), Dictation, Natural language processing, Artificial intelligence, Information retrieval, Mobile device, Cultural reasons, Database, Text recognition
DocType: Conference
Citations: 2
PageRank: 0.38
References: 14
Authors: 2
Name                          Order  Citations  PageRank
Emilio Granell                1      42         6.80
Carlos D. Martínez-Hinarejos  2      38         10.86