Title
Multimodal Crowdsourcing for Transcribing Handwritten Documents.
Abstract
Transcription of handwritten documents is an important research topic for multiple applications, such as document classification or information extraction. In the case of historical documents, transcription helps to preserve cultural heritage, given the amount of historical data those documents contain. The transcription process can employ state-of-the-art handwritten text recognition systems to obtain an initial transcription. This initial transcription usually does not meet the required quality standards, but it may speed up the expert's final transcription. In this framework, the use of collaborative transcription applications (crowdsourcing) has risen in recent years, but these platforms are mainly limited to non-mobile devices, which reduces recruiting initiatives to a smaller set of potential volunteers. In this paper, an alternative that allows the use of mobile devices is presented. The proposal consists of using speech dictation of handwritten text lines. Then, by using a multimodal combination of speech and handwritten text images, a draft transcription can be obtained that is of higher quality than one obtained using handwritten text recognition alone. The speech dictation platform is implemented as a mobile device application, which widens the population from which volunteers can be recruited. Real data were acquired with the platform from the contents of a Spanish historical handwritten book, and these data were used to perform experiments on the behaviour of the proposed framework. Further experiments studied how to optimise the collaborators' effort in terms of the number of collaborations, including how many lines and which lines should be selected for speech dictation.
Year
2017
DOI
10.1109/TASLP.2016.2634123
Venue
IEEE/ACM Trans. Audio, Speech & Language Processing
Keywords
Speech, Crowdsourcing, Speech recognition, Reliability, Speech processing, Text recognition, Mobile handsets
Field
Document classification, Population, Transcription (linguistics), Speech processing, Computer science, Crowdsourcing, Speech recognition, Dictation, Information extraction, Transcription (software), Artificial intelligence, Natural language processing
DocType
Journal
Volume
25
Issue
2
ISSN
2329-9290
Citations
1
PageRank
0.36
References
21
Authors
2
Name
Order
Citations
PageRank
Emilio Granell1426.80
Carlos D. Martínez-Hinarejos23810.86