Title
An OCR-Independent Character Segmentation Using Shortest-Path in Grayscale Document Images
Abstract
An Optical Character Recognition (OCR) system with a high recognition rate is challenging to develop. One of the major contributors to OCR errors is smeared characters. Several factors lead to smearing, such as poor scanning quality and a poor binarization technique. Typical approaches to character segmentation fall into three major categories: image-based, recognition-based, and holistic. Within these approaches, the segmentation path can be linear or non-linear. Our paper proposes a non-linear approach to segmenting characters in grayscale document images. Our method first determines whether characters are smeared together using general character features. The correct segmentation path is then found using a shortest-path approach. We achieved a segmentation accuracy of 95% over a set of about 2,000 smeared characters.
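The abstract does not give the cost function or graph construction, so the following is only an illustrative sketch of what a non-linear shortest-path cut through a grayscale image could look like, not the authors' method. It assumes that each pixel is a graph node, that entering a pixel costs its darkness (255 minus intensity) plus 1, and that the path may step down, diagonally down, or sideways. The function name `segmentation_path` and the toy image are hypothetical.

```python
import heapq


def segmentation_path(gray):
    """Return a top-to-bottom pixel path [(row, col), ...] of minimum total cost.

    gray: non-empty list of equal-length rows of intensities in 0..255 (0 = ink).
    Assumed cost model (not from the paper): entering a pixel costs
    (255 - intensity) + 1, so the path prefers light pixels and short routes.
    """
    rows, cols = len(gray), len(gray[0])

    def cost(r, c):
        return 255 - gray[r][c] + 1

    dist, prev, heap = {}, {}, []
    # Every pixel in the top row is a possible starting point.
    for c in range(cols):
        dist[(0, c)] = cost(0, c)
        heapq.heappush(heap, (dist[(0, c)], 0, c))

    # Dijkstra's algorithm; all step costs are positive.
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > dist[(r, c)]:
            continue  # stale heap entry
        if r == rows - 1:
            # First bottom-row pixel popped is the cheapest; walk back to the top.
            path = [(r, c)]
            while path[-1] in prev:
                path.append(prev[path[-1]])
            return path[::-1]
        # Allowed moves: down-left, down, down-right, left, right.
        for dr, dc in ((1, -1), (1, 0), (1, 1), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nc < cols and nr < rows:
                nd = d + cost(nr, nc)
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = (r, c)
                    heapq.heappush(heap, (nd, nr, nc))
    return []


# Toy example: two dark blobs separated by a light gap that bends to the right.
img = [
    [0, 0, 255, 0, 0],
    [0, 0, 255, 0, 0],
    [0, 0, 0, 255, 0],
    [0, 0, 0, 255, 0],
]
path = segmentation_path(img)
```

In this toy image the cut follows the light pixels and shifts one column to the right halfway down, which a straight vertical (linear) cut could not do without crossing ink.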
Year
2007
DOI
10.1109/ICMLA.2007.18
Venue
ICMLA
Keywords
optical character recognition, image segmentation, shortest path
Field
Computer vision, Scale-space segmentation, Pattern recognition, Shortest path problem, Document image processing, Computer science, Segmentation, Optical character recognition, Segmentation-based object categorization, Image segmentation, Artificial intelligence, Grayscale
DocType
Conference
ISBN
0-7695-3069-9
Citations
7
PageRank
0.57
References
14
Authors
4
Name
Order
Citations
PageRank
Jia Tse181.62
Christopher Jones27020.25
Dean Curtis3102.36
Evangelos A. Yfantis41610.80