Title
A Human-Inspired Recognition System for Pre-Modern Japanese Historical Documents
Abstract
Recognition of historical documents is challenging because the characters and backgrounds are often noisy and damaged. Pre-modern Japanese historical documents pose an additional difficulty: the characters were written in cursive and are connected to one another, so methods based on character segmentation do not work well. This motivates a new recognition system. In this paper, we propose a human-inspired document reading system that recognizes multiple lines of pre-modern Japanese historical documents. When reading, people use eye movements to find the start of a text line and then move their eyes from the current character or word to the next. They can also determine the end of a line, or skip a figure, and move on to the next line. These eye movements are integrated with visual processing in the brain to carry out the reading process. We implement the recognition system with an attention-based encoder-decoder. First, the system detects where a text line starts. Second, it scans the line and recognizes it character by character until the line is complete. It then detects the start of the next text line, and the process repeats until the whole document has been read. As a result, the system recognizes multiple lines of connected, cursive characters without performing character or line segmentation. We also employ a coverage model that stores the history of eye movements so that the next movement can be predicted more precisely. We tested the system on the pre-modern Japanese historical documents provided by the PRMU Kuzushiji competition. The experiments demonstrate the effectiveness of the proposed system, which achieves Sequence Error Rates of 9.87% and 53.81% on level 2 and level 3 of the dataset, respectively. These results outperform all other systems that participated in the PRMU Kuzushiji competition.
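The coverage idea in the abstract can be illustrated with a toy attention step. The sketch below is not the authors' model: it assumes a fixed decoder query, dot-product scoring, and a hand-set negative `coverage_weight`, all of which are hypothetical choices made for illustration, whereas the paper's system learns these components inside an attention-based encoder-decoder. It only shows how accumulating past attention weights can push the next "eye movement" away from positions that have already been read.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, features, coverage, coverage_weight):
    """One attention step with a coverage term (illustrative assumption).

    query:    decoder state vector (list of floats)
    features: encoder feature vectors, one per image position
    coverage: sum of past attention weights for each position
    """
    scores = [
        sum(q * f for q, f in zip(query, feat)) + coverage_weight * cov
        for feat, cov in zip(features, coverage)
    ]
    weights = softmax(scores)
    # Coverage stores the attention history: add this step's weights.
    new_coverage = [c + w for c, w in zip(coverage, weights)]
    return weights, new_coverage


def read(features, query, steps, coverage_weight=-4.0):
    """Run several attention steps and record the most-attended position."""
    coverage = [0.0] * len(features)
    visited = []
    for _ in range(steps):
        weights, coverage = attend(query, features, coverage, coverage_weight)
        visited.append(max(range(len(weights)), key=weights.__getitem__))
    return visited, coverage


if __name__ == "__main__":
    feats = [[2.0], [1.0], [0.0]]  # toy one-dimensional features
    print(read(feats, [1.0], steps=2))                       # with coverage
    print(read(feats, [1.0], steps=2, coverage_weight=0.0))  # without coverage
```

With the coverage term switched off (`coverage_weight=0.0`), the toy reader attends to the same position at every step; with a negative weight, the second step moves to a different position because the first one has already accumulated attention.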
Year
2019
DOI
10.1109/ACCESS.2019.2924449
Venue
IEEE ACCESS
Keywords
A human reading-inspired recognition system, recognition of pre-modern Japanese historical document, attention-based encoder-decoder, Kuzushiji
Field
Cursive, Visual processing, Recognition system, Computer science, Segmentation, Word error rate, Computer network, Artificial intelligence, Natural language processing, Historical document, Kanji
DocType
Journal
Volume
7
ISSN
2169-3536
Citations
1
PageRank
0.48
References
0
Authors
3
Name | Order | Citations | PageRank
Anh Duc Le | 1 | 21 | 5.70
Tarin Clanuwat | 2 | 1 | 0.48
Asanobu Kitamoto | 3 | 84 | 15.31