Title
Textual indexation of ancient documents
Abstract
In recent years, many levels of indexation have been developed to allow fast retrieval of digitized documents. Among all the ways of indexing a document, textual indexation allows the finest queries on the document's content. Usually, the plain text transcription of a digitized document is obtained by applying OCR (Optical Character Recognition) software to it. What if the OCR fails? OCR systems are inefficient on low-quality printed documents and are unsuited to the processing of ancient fonts. Furthermore, OCR is not applicable to manuscript text recognition. In this paper we introduce two alternative methods of accessing text through the image: Computer Assisted Transcription and Word Spotting.
Year
2005
DOI
10.1145/1096601.1096630
Venue
ACM Symposium on Document Engineering
Keywords
computer assisted transcription,low-quality printed document,plain text transcription,ancient document,optical character recognition,text trough,manuscript text recognition,word spotting,ocr system,digitized document,textual indexation,indexation
Field
Indexation,Information retrieval,Computer science,Document processing,Search engine indexing,Optical character recognition,Software,Plain text,Schema matching,Spotting
DocType
Conference
ISBN
1-59593-240-2
Citations
12
PageRank
0.79
References
2
Authors
3
Name, Order, Citations, PageRank
Yann Leydier, 1, 174, 10.13
Frank Lebourgeois, 2, 256, 23.94
Hubert Emptoz, 3, 383, 38.09