Title
Benchmarking Post-processing Techniques for Offline Arabic Text Recognition System.
Abstract
Automatic recognition of offline Arabic text remains a major challenge because of the nature of Arabic script. Research interest in this area has grown recently, and a variety of methods have been applied. This paper presents a comparative study of four OCR (Optical Character Recognition) post-processing error correction techniques. We evaluate their impact using two recognition approaches: a lexicon-driven approach, with and without OOV (Out Of Vocabulary) words, and a lexicon-free approach. An AOCR (Arabic Optical Character Recognition) system was developed for this purpose, based on a segmentation-free HMM (Hidden Markov Model) approach. A sliding window is moved across the line image from right to left to extract histogram of oriented gradients (HOG) features. Experiments carried out on the KAFD database under different scenarios show a significant improvement in the OCR error correction rate.
Year
2016
DOI
10.1007/978-3-319-52941-7_27
Venue
PROCEEDINGS OF THE 16TH INTERNATIONAL CONFERENCE ON HYBRID INTELLIGENT SYSTEMS (HIS 2016)
Keywords
AOCR,HMM,Lexicon,Language model,Post-processing,Sequence alignment
Field
Histogram,Sliding window protocol,Computer science,Optical character recognition,Speech recognition,Lexicon,Natural language processing,Artificial intelligence,Hidden Markov model,Right-to-left,Language model,Arabic script
DocType
Conference
Volume
552
ISSN
2194-5357
Citations
0
PageRank
0.34
References
0
Authors
3
Name, Order, Citations, PageRank
Sana Khamekhem Jemni, 1, 1, 1.72
Yousri Kessentini, 2, 0, 0.34
Slim Kanoun, 3, 209, 20.14