Title
Document Image Coding for Processing and Retrieval
Abstract
Document images belong to a unique class of images where the information is embedded in the language represented by a series of symbols on the page rather than in the visual objects themselves. Since these symbols tend to appear repeatedly, a domain-specific image coding strategy can be designed to facilitate enhanced compression and retrieval. In this paper we describe a coding methodology that not only exploits component-level redundancy to reduce code length but also supports efficient data access. The approach identifies and organizes symbol patterns which appear repeatedly. Similar components are represented by a single prototype stored in a library and the location of each component instance is coded along with the residual between it and its prototype. A representation is built which provides a natural information index allowing access to individual components. Compression results are competitive and compressed-domain access is superior to competing methods. Applications to network-related problems have been considered, and show promising results.
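The prototype-plus-residual idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bitmap representation, the Hamming-distance matching rule, the `max_dist` threshold, and the function names `encode`, `decode`, and `locate` are all assumptions made for illustration only.

```python
from typing import List, Tuple

# Assumption: a component is a small binary bitmap stored as rows of 0/1 pixels.
Bitmap = Tuple[Tuple[int, ...], ...]
Component = Tuple[int, int, Bitmap]          # (x, y, bitmap)
Instance = Tuple[int, int, int, Bitmap]      # (prototype id, x, y, residual)


def xor(a: Bitmap, b: Bitmap) -> Bitmap:
    """Pixel-wise XOR of two bitmaps of the same shape."""
    return tuple(tuple(p ^ q for p, q in zip(ra, rb)) for ra, rb in zip(a, b))


def same_shape(a: Bitmap, b: Bitmap) -> bool:
    return len(a) == len(b) and all(len(ra) == len(rb) for ra, rb in zip(a, b))


def hamming(a: Bitmap, b: Bitmap) -> int:
    """Number of pixels in which two same-shape bitmaps differ."""
    return sum(sum(row) for row in xor(a, b))


def encode(components: List[Component], max_dist: int = 1):
    """Build a prototype library and code each instance against it."""
    library: List[Bitmap] = []
    instances: List[Instance] = []
    for x, y, bmp in components:
        match = None
        for i, proto in enumerate(library):
            # Hypothetical matching rule: same shape and few differing pixels.
            if same_shape(bmp, proto) and hamming(bmp, proto) <= max_dist:
                match = i
                break
        if match is None:
            library.append(bmp)              # first occurrence becomes the prototype
            match = len(library) - 1
        # The residual records exactly where the instance differs from its prototype.
        instances.append((match, x, y, xor(bmp, library[match])))
    return library, instances


def decode(library: List[Bitmap], instances: List[Instance]) -> List[Component]:
    """Lossless reconstruction: prototype XOR residual gives back the instance."""
    return [(x, y, xor(library[i], r)) for i, x, y, r in instances]


def locate(instances: List[Instance], proto_id: int) -> List[Tuple[int, int]]:
    """Compressed-domain lookup: positions of every instance of one prototype."""
    return [(x, y) for i, x, y, _ in instances if i == proto_id]


a = ((1, 1), (1, 0))
a_noisy = ((1, 1), (1, 1))                   # differs from `a` in one pixel
b = ((0, 0), (0, 1))
components = [(0, 0, a), (5, 0, a_noisy), (10, 0, b)]
library, instances = encode(components)
```

In this sketch the instance list doubles as an index: `locate` finds every occurrence of a symbol without reconstructing any pixels, which is the kind of compressed-domain access the abstract refers to. Dropping the residuals would give a lossy variant in which every instance is rendered as its prototype.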
Year: 1998
DOI: 10.1023/A:1008074424861
Venue: VLSI Signal Processing
Keywords: Document Image, Lossy Compression, Progressive Transmission, Document Image Analysis, Residual Code
Field: Computer science, Coding (social sciences), Theoretical computer science, Redundancy (engineering), Artificial intelligence, Visual Objects, Computer vision, Residual, Pattern recognition, Information retrieval, Lossy compression, Symbol, Exploit, Data access
DocType: Journal
Volume: 20
Issue: 1/2
ISSN: 0922-5773
Citations: 0
PageRank: 0.34
References: 11
Authors: 2
Name
Order
Citations
PageRank
Omid E. Kia16611.12
David Doermann24313312.70