Title
Fusion of Word Spotting and Spatial Information for Figure Caption Retrieval in Historical Document Images
Abstract
We present a method for figure caption detection that fuses several information sources. The evaluation is performed on documents gathered from the collection of the historical medical digital library Medic@. A method based on perceptual grouping simultaneously segments the vertical and horizontal text lines in a page. Spatial relationships between the text lines and the graphics are used to select a set of caption line candidates. A feature-based word-spotting method is proposed to retrieve the occurrences of word images similar to a given query. Word spotting is applied to detect the label of the captions, a word like 'Fig', 'FIG', 'Figure' ... followed by the figure number. Combining spatial information and word recognition greatly improves the detection of caption lines. Our initial experiments process more than 300 pages from three different books.
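The abstract does not say how word images are compared or how the two sources are fused. The keywords mention dynamic time warping, so the following is only a minimal sketch under that assumption: spatially selected candidate lines are kept when the first word of the line is close, in DTW cost, to a query image of the label. The names `dtw_distance` and `select_captions`, the toy feature sequences, and the threshold are all hypothetical and not taken from the paper.

```python
from math import inf

def dtw_distance(a, b):
    """Dynamic time warping cost between two sequences of feature vectors."""
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Squared Euclidean distance between the two frames
            d = sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1]))
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def select_captions(candidates, query, threshold):
    """Keep spatial candidates whose first word matches the label query.

    candidates: list of (line_id, first_word_features) pairs, assumed to have
    been selected already from their position relative to a graphic.
    """
    return [line_id for line_id, first_word in candidates
            if dtw_distance(first_word, query) <= threshold]

# Toy data (hypothetical): each word image is a sequence of 1-D column features.
query_fig = [(0.0,), (1.0,), (2.0,), (1.0,)]
candidates = [
    ("line_a", [(0.0,), (1.0,), (1.0,), (2.0,), (1.0,)]),  # stretched copy of the query
    ("line_b", [(5.0,), (5.0,), (4.0,), (5.0,)]),          # unrelated word
]
print(select_captions(candidates, query_fig, threshold=0.5))  # ['line_a']
```

DTW tolerates the horizontal stretching seen in `line_a`, which is why it is a common choice for comparing word images of different widths. A real system would use richer per-column features and a tuned threshold.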
Year
2009
DOI
10.1109/ICDAR.2009.161
Venue
ICDAR-1
Keywords
horizontal text line,figure caption detection,historical medical digital library,historical document images,word image,word processing,digital libraries,spatial information,figure number,image segmentation,computer graphics,historical document image,edit distance,caption line,caption line candidate,dynamic time warping,image retrieval,word recognition,vertical text line,figure caption retrieval,feature-based wordspotting method,information source,text analysis,word spotting,spatial perception,document image processing,feature extraction,data mining,graphics,digital library,information analysis,spatial relationships,indexing,biomedical imaging,information retrieval
Field
Edit distance,Information retrieval,Dynamic time warping,Computer science,Word recognition,Image retrieval,Search engine indexing,Feature extraction,Artificial intelligence,Natural language processing,Word processing,Historical document
DocType
Conference
ISSN
1520-5363 E-ISBN : 978-0-7695-3725-2
ISBN
978-0-7695-3725-2
Citations
8
PageRank
0.55
References
11
Authors
3
Name,Order,Citations,PageRank
Khurram Khurshid,1,129,15.94
Claudie Faure,2,133,10.62
Nicole Vincent,3,8,0.55