Multimodal Output Combination For Transcribing Historical Handwritten Documents

Paper Info

Title
Multimodal Output Combination For Transcribing Historical Handwritten Documents

Abstract
Transcription of digitalised historical documents is an interesting task in the document analysis area. This transcription can be achieved by using Handwritten Text Recognition (HTR) on digitalised pages or by using Automatic Speech Recognition (ASR) on the dictation of their contents. Another option is to use both systems in a multimodal combination to obtain a draft transcription, given that combining the outputs of different recognition systems generally improves recognition accuracy. In this work, we present a new combination method based on Confusion Networks. We check its effectiveness for transcribing a Spanish historical book. Results on both unimodal combination with different optical (for HTR) and acoustic (for ASR) models, and multimodal combination, show relative reductions in Word Error Rate and Character Error Rate of 14.3% and 16.6%, respectively, over the HTR baseline.
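The abstract reports its gains as relative reductions in Word Error Rate (WER) and Character Error Rate (CER). As a reading aid, the sketch below shows how these quantities are commonly computed: an edit distance between reference and hypothesis, divided by the reference length. It is a minimal illustration, not the authors' scoring tool, which this record does not name. The function names and the toy Spanish sentence are assumptions made for the example and are not taken from the paper's data.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (unit cost per edit)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion of a reference token
                            curr[j - 1] + 1,      # insertion of a hypothesis token
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """WER: word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def char_error_rate(reference, hypothesis):
    """CER: character-level edit distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def relative_reduction(baseline, improved):
    """Relative error reduction of `improved` over `baseline`."""
    return (baseline - improved) / baseline


# Toy data (hypothetical, for illustration only).
reference = "en el nombre de dios"
baseline_output = "en al nombre dios"   # one substitution, one deletion
combined_output = "en el nombre dios"   # one deletion

baseline_wer = word_error_rate(reference, baseline_output)
combined_wer = word_error_rate(reference, combined_output)
print(baseline_wer, combined_wer, relative_reduction(baseline_wer, combined_wer))
```

Read this way, a relative WER reduction of 14.3% means the combined system's WER is 14.3% lower than the HTR baseline's WER as a proportion of that baseline, not 14.3 absolute percentage points lower.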

Year	DOI	Venue
2015	10.1007/978-3-319-23192-1_21	COMPUTER ANALYSIS OF IMAGES AND PATTERNS, CAIP 2015, PT I
Keywords	Field	DocType
Document analysis and transcription, Handwritten text recognition, Automatic speech recognition, Confusion Networks combination, Recognition outputs combination	Transcription (linguistics), Confusion, Document analysis, Pattern recognition, Computer science, Word error rate, Speech recognition, Dictation, Natural language processing, Artificial intelligence, Text recognition, Intelligent word recognition	Conference
Volume	ISSN	Citations
9256	0302-9743	3
PageRank	References	Authors
0.41	4	2

Authors (2 rows)

Name	Order	Citations	PageRank
Emilio Granell	1	42	6.80
Carlos D. Martínez-Hinarejos	2	38	10.86
