Title
The extraction of text/graphs from degraded documents
Abstract
This paper presents a method for improving the quality of degraded documents through noise removal and text enhancement. The histogram of a degraded document is analyzed to find the approximate gray-value ranges for text, graph (i.e., photograph), and background pixels. Once the graph pixels are identified, they are replaced by background pixels. The agent-growing method described by S. H. Yen and M. C. Shih (2000) is then applied to smooth the noisy background, yielding a document with clearly readable text and a clean background. Finally, the graph pixels are restored, so that the resulting document has text of much better quality and any photographs preserved. Experiments verifying the efficacy of the proposed method, and comparisons with some existing techniques, are also presented.
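The pipeline outlined in the abstract (classify pixels by gray-value range, hide the graph pixels, smooth the background, restore the graph pixels) can be illustrated with a minimal sketch. Everything specific here is an assumption and not taken from the paper: the function names, the thresholds `text_max` and `graph_max`, and the ordering of the ranges (text darkest, background brightest). The smoothing step is a plain median fill standing in for the agent-growing method, which the abstract does not describe.

```python
import numpy as np

TEXT, GRAPH, BACKGROUND = 0, 1, 2


def classify_pixels(gray, text_max, graph_max):
    """Label each pixel by gray-value range.

    Assumption: text is darkest (<= text_max), graph pixels are
    mid-range (<= graph_max), and everything brighter is background.
    """
    labels = np.full(gray.shape, BACKGROUND, dtype=np.uint8)
    labels[gray <= graph_max] = GRAPH
    labels[gray <= text_max] = TEXT
    return labels


def enhance(gray, text_max, graph_max):
    """Hide graph pixels, flatten the background, then restore graph pixels."""
    labels = classify_pixels(gray, text_max, graph_max)
    background = gray[labels == BACKGROUND].astype(np.float64)
    if background.size == 0:
        raise ValueError("no background pixels found for these thresholds")
    fill = int(np.median(background))

    work = gray.copy()
    # Step 1: replace graph pixels with a background value.
    work[labels == GRAPH] = fill
    # Step 2: placeholder smoothing (NOT the agent-growing method):
    # set every non-text pixel to the median background value.
    work[labels != TEXT] = fill
    # Step 3: recover the original graph pixels.
    work[labels == GRAPH] = gray[labels == GRAPH]
    return work


if __name__ == "__main__":
    page = np.array([[10, 120, 240],
                     [250, 15, 130]], dtype=np.uint8)
    print(enhance(page, text_max=50, graph_max=180))
```

In practice the two thresholds would be read off the histogram of the degraded page, for example from the valleys between its peaks; the sketch takes them as given.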
Year
2004
DOI
10.1109/MULMM.2004.1264984
Venue
MMM
Keywords
the noisy background, existing technique, gray-value, noise, noise removal, background pixel, character recognition, degraded documents, clear readable condition, histogram, background pixels, feature extraction, degraded document, graph pixels, final result, text analysis, photograph pixels, software agents, text enhancing, quality of degraded document, the degraded document, agent-growing method
Field
Histogram, Computer vision, Graph, Text mining, Pattern recognition, Character recognition, Computer science, Software agent, Feature extraction, Pixel, Artificial intelligence, Noise removal
DocType
Conference
ISBN
0-7695-2084-7
Citations
0
PageRank
0.34
References
8
Authors
4
Name            Order  Citations  PageRank
Shwu-huey Yen   1      42         9.07
Yi-Jin Chen     2      13         3.46
Hui-Jen Lin     3      0          0.34
Chia-Jen Wang   4      17         3.25