Title
A Multi-Font OCR System for Printed Telugu Text
Abstract
This work describes the design and development of a Telugu Optical Character Recognition system for printed text (TOSP). The pre-processing tasks considered in this paper are conversion of a grey scale image to a binary image, image rectification, skew detection and removal, and segmentation of text into lines, words and basic symbols. Basic symbols are identified as the fundamental unit of segmentation, and they are the units that the recognizer recognizes. The combinations of these basic symbols that together form the characters and compound characters of Telugu are also determined to complete the recognition process. The special feature of TOSP is that it is designed to handle multiple sizes and multiple fonts. Further, the output produced by TOSP can be opened directly and edited in any Indian language software that supports transliteration into Telugu script. Several such software packages are popular and available.
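Two of the pre-processing steps named in the abstract, binarization and segmentation of text into lines, can be illustrated with a minimal sketch. This is not the TOSP implementation: the abstract does not say which methods the system uses. The sketch assumes a fixed global threshold and a horizontal projection profile, assumes skew has already been removed, and uses hypothetical names (`binarize`, `segment_lines`, `threshold`).

```python
from typing import List, Tuple

# An image is a list of rows of grey levels, 0 (black) to 255 (white).
Image = List[List[int]]


def binarize(gray: Image, threshold: int = 128) -> Image:
    """Map each pixel to 1 (ink) or 0 (background).

    Assumption: a fixed global threshold; the paper's actual
    binarization method is not given in the abstract.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def segment_lines(binary: Image) -> List[Tuple[int, int]]:
    """Return (start, end) row ranges, end exclusive, for each text line.

    A text line is taken to be a maximal run of consecutive rows
    that contain at least one ink pixel (horizontal projection profile).
    """
    profile = [sum(row) for row in binary]  # ink pixels per row
    lines: List[Tuple[int, int]] = []
    start = None
    for y, ink in enumerate(profile):
        if ink > 0 and start is None:
            start = y                  # a text line begins
        elif ink == 0 and start is not None:
            lines.append((start, y))   # a blank row ends the line
            start = None
    if start is not None:              # line touching the bottom edge
        lines.append((start, len(profile)))
    return lines


# Tiny hypothetical page: two text lines separated by one blank row.
page = [
    [255, 255, 255],
    [255, 10, 255],
    [20, 20, 255],
    [255, 255, 255],
    [255, 0, 0],
]
line_ranges = segment_lines(binarize(page))
```

The same run-finding idea, applied to column sums within each line, would give a first cut at word segmentation; splitting words into the basic symbols described in the abstract would need additional analysis that this sketch does not attempt.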
Year
2002
DOI
10.1109/LEC.2002.1182284
Venue
Language Engineering Conference
Keywords
multiple size, printed text, indian language software, printed telugu text, binary image, grey scale image, multiple font, telugu script, basic symbol, image rectification, telugu optical character recognition, multi-font ocr system, fundamental unit, optical character recognition, image segmentation
Field
Segmentation, Image rectification, Computer science, Font, Binary image, Optical character recognition, Speech recognition, Image segmentation, Telugu, Transliteration
DocType
Conference
ISBN
0-7695-1885-0
Citations
5
PageRank
0.51
References
4
Authors
2
Name                  Order  Citations  PageRank
C. Vasantha Lakshmi   1      25         3.16
C. Patvardhan         2      78         12.28