Title
Transfer Learning for a Letter-Ngrams to Word Decoder in the Context of Historical Handwriting Recognition with Scarce Resources.
Abstract
Lack of data can be an issue when beginning a new study on historical handwritten documents. To deal with this, we present the character-based decoder part of a multilingual approach based on transductive transfer learning for a historical handwriting recognition task on Italian Comedy Registers. The decoder must build a sequence of characters that corresponds to a word from a vector of letter-ngrams. As learning data, we created a new dataset from untapped resources that covers the same domain and period as our Italian Comedy data, as well as resources from common domains, periods, or languages. We obtain a 97.42% Character Recognition Rate and an 86.57% Word Recognition Rate on our Italian Comedy data, despite a lexical coverage of 67% between the Italian Comedy data and the training data. These results show that an efficient system can be obtained by carefully selecting the datasets used for the transfer learning.
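As an illustration of two notions the abstract relies on, the following minimal Python sketch builds a bag of letter n-grams for a word and computes a lexical coverage between two word lists. It is not the authors' implementation: the function names `letter_ngrams` and `lexical_coverage`, the maximum n-gram length `n_max`, and the choice to measure coverage over distinct word forms are all assumptions made for this example.

```python
from collections import Counter


def letter_ngrams(word, n_max=3):
    """Return a bag (Counter) of all letter n-grams of length 1..n_max in `word`.

    Assumption: n-grams are contiguous character substrings; the paper's exact
    vector layout is not given in the abstract.
    """
    bag = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(word) - n + 1):
            bag[word[i:i + n]] += 1
    return bag


def lexical_coverage(target_words, training_words):
    """Fraction of distinct target words that also occur in the training words.

    Assumption: coverage is computed over distinct word forms (types),
    not over running tokens.
    """
    target_vocab = set(target_words)
    train_vocab = set(training_words)
    if not target_vocab:
        return 0.0
    return len(target_vocab & train_vocab) / len(target_vocab)
```

Under these assumptions, a coverage of 0.67 would mean that about two thirds of the distinct target words also appear in the training vocabulary; a character-based decoder is what allows the remaining out-of-vocabulary words to be produced at all.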
Year: 2018
Venue: COLING
Field: Training set, Transduction (machine learning), Scarcity, Character recognition, Computer science, Comedy, Word recognition, Transfer of learning, Handwriting recognition, Natural language processing, Artificial intelligence
DocType: Conference
Volume: C18-1
Citations: 0
PageRank: 0.34
References: 0
Authors: 5
Name
Order
Citations
PageRank
Adeline Granet101.69
Emmanuel Morin24216.13
Harold Mouchère310714.46
Solen Quiniou4719.97
Christian Viard-Gaudin544446.20