Title
Large Scale Parallel Document Image Processing
Abstract
Building a system that allows searching a very large database of document images requires professionalization of hardware and software, e-science, and web access. In astrophysics there is ample experience in dealing with large data sets, due to an increasing number of measurement instruments. The digitization of historical documents of the Dutch cultural heritage poses a similar problem. This paper discusses the use of a system developed at the Kapteyn Institute of Astrophysics for the processing of large data sets, applied to the problem of creating a very large searchable archive of connected cursive handwritten texts. The system is adapted to the specific needs of processing document images. It shows that interdisciplinary collaboration can be beneficial in the context of machine learning, data processing, and the professionalization of image processing and retrieval systems.
Year
2008
DOI
10.1117/12.765482
Venue
DOCUMENT RECOGNITION AND RETRIEVAL XV
Keywords
handwriting recognition, image retrieval, supercomputing, pattern recognition, E-science
Field
Computer vision, Data processing, Cursive, Digitization, Information retrieval, e-Science, Computer science, Image processing, Very large database, Image retrieval, Software, Artificial intelligence
DocType
Conference
Volume
6815
ISSN
0277-786X
Citations
1
PageRank
0.39
References
3
Authors
3
Name | Order | Citations | PageRank
Tijn van der Zant | 1 | 122 | 9.70
Lambert Schomaker | 2 | 1309 | 87.50
Edwin Valentijn | 3 | 23 | 5.80