Title
Word-Based Adaptive OCR for Historical Books
Abstract
This work proposes a new approach to the recognition of historical texts: an adaptive mechanism that automatically tunes itself to a specific book. The system clusters all similar words in a book or text and handles each entire class simultaneously. The paper describes the architecture of such a system and the new algorithms developed for robust word image comparison (including registration, optical-flow-based distortion compensation, and adaptive binarization). Results for a large dataset are also presented, demonstrating a recognition improvement of over 23%.
Year
2009
DOI
10.1109/ICDAR.2009.133
Venue
ICDAR-1
Keywords
historical text,entire class,historical books,specific book,recognition improvement,adaptive mechanism,new algorithm,adaptive binarization,distortion compensation,new approach,large dataset,word-based adaptive ocr,optical imaging,shape,engines,optical flow,document processing,text analysis,history,electronic publishing,optical character recognition,image recognition
Field
Computer science,Artificial intelligence,Natural language processing,Cluster analysis,Distortion,Word processing,Computer vision,Architecture,Document processing,Optical character recognition,Speech recognition,Optical flow,Electronic publishing
DocType
Conference
Citations
12
PageRank
0.78
References
11
Authors
5
Name                      Order  Citations  PageRank
Vladimir Kluzner          1      14         1.82
Asaf Tzadok               2      26         2.91
Yuval Shimony             3      12         0.78
Eugene Walach             4      100        11.65
Apostolos Antonacopoulos  5      378        36.45