Transfer Learning for a Letter-Ngrams to Word Decoder in the Context of Historical Handwriting Recognition with Scarce Resources. - Citegraph

Paper Info

Title
Transfer Learning for a Letter-Ngrams to Word Decoder in the Context of Historical Handwriting Recognition with Scarce Resources.

Abstract
Lack of data can be an issue when beginning a new study on historical handwritten documents. To deal with this, we present the character-based decoder part of a multilingual approach based on transductive transfer learning for a historical handwriting recognition task on Italian Comedy Registers. The decoder must build a sequence of characters that corresponds to a word from a vector of letter-ngrams. As learning data, we created a new dataset from untapped resources that covers the same domain and period as our Italian Comedy data, as well as resources from common domains, periods, or languages. We obtain a 97.42% Character Recognition Rate and an 86.57% Word Recognition Rate on our Italian Comedy data, despite a lexical coverage of 67% between the Italian Comedy data and the training data. These results show that an efficient system can be obtained by carefully selecting the datasets used for the transfer learning.
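
To make two terms from the abstract concrete, here is a minimal Python sketch of a letter-ngram representation of a word and of lexical coverage between two word lists. It is an illustration under stated assumptions, not the authors' implementation: the function names `letter_ngrams` and `lexical_coverage`, the n-gram lengths (1 to 3), the use of raw counts, and the choice to measure coverage over distinct words are all hypothetical, and the paper may define these quantities differently.

```python
from collections import Counter


def letter_ngrams(word, n_max=3):
    """Count every letter n-gram of length 1..n_max in `word`.

    Assumption: a simple bag of n-gram counts; the paper's actual
    vector layout is not given in this record.
    """
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(word) - n + 1):
            counts[word[i:i + n]] += 1
    return counts


def lexical_coverage(target_words, training_words):
    """Fraction of distinct target words that also occur in the training words.

    Assumption: coverage is measured over word types, not tokens.
    """
    target_vocab = set(target_words)
    if not target_vocab:
        return 0.0
    return len(target_vocab & set(training_words)) / len(target_vocab)
```

For example, `letter_ngrams("comedie")` yields the counts a decoder of this kind would have to turn back into the character sequence "comedie", and `lexical_coverage(target, training)` gives the share of the target vocabulary already seen in the training data.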

Year	Venue	Field
2018	COLING	Training set, Transduction (machine learning), Scarcity, Character recognition, Computer science, Comedy, Word recognition, Transfer of learning, Handwriting recognition, Natural language processing, Artificial intelligence
DocType	Volume	Citations
Conference	C18-1	0
PageRank	References	Authors
0.34	0	5

Authors (5 rows)

Name	Order	Citations	PageRank
Adeline Granet	1	0	1.69
Emmanuel Morin	2	42	16.13
Harold Mouchère	3	107	14.46
Solen Quiniou	4	71	9.97
Christian Viard-Gaudin	5	444	46.20