Title
Two Stream Deep Neural Network For Sequence-Based Urdu Ligature Recognition
Abstract
Urdu is written in a complex cursive script, and its large number of ligatures makes recognition challenging for OCR systems. Several techniques for recognizing Urdu ligatures have been proposed in the literature. However, we find that suitably challenging datasets, and consequently higher recognition rates, are still needed for ligature recognition. In this paper, a hybrid model based on the holistic approach is adopted for the recognition of Urdu ligatures (compound characters). More than 3800 unique ligatures were used to generate 46K (38K training, 7K testing) synthetic ligature images, applying 9 different kinds of transformations in addition to the normal ligatures. Each ligature is processed through two streams of deep neural networks, namely AlexNet and VGG16, to obtain a set of features from each network. These features are fused and then used as input to a two-layer Bidirectional Long Short-Term Memory (BLSTM) network for learning a model. The learned model maps ligature images to their corresponding sequences of individual Urdu characters. The output of the proposed methodology is in editable Urdu-script format. The proposed model was evaluated and showed an accuracy of 97% on the training dataset and 80% on more than 7K parametrically different query ligatures (the test set).
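The dataset figures quoted in the abstract can be sanity-checked with a little arithmetic. The sketch below is only an illustration: it assumes each unique ligature is rendered once in its normal form and once per transformation, which the abstract does not state explicitly, and the function names are hypothetical. Because the abstract says "more than 3800" ligatures, the result is a lower bound.

```python
# Rough consistency check of the dataset size quoted in the abstract.
# The constants come from the abstract; the one-image-per-variant rule
# is an ASSUMPTION made for this illustration, not the authors' stated method.

UNIQUE_LIGATURES = 3800   # "more than 3800", so treat this as a lower bound
NUM_TRANSFORMATIONS = 9   # kinds of transformations named in the abstract


def images_per_ligature(num_transformations: int) -> int:
    """Variants per ligature: the normal rendering plus one per transformation."""
    return 1 + num_transformations


def min_synthetic_images(unique_ligatures: int, num_transformations: int) -> int:
    """Lower bound on generated images under the one-image-per-variant assumption."""
    return unique_ligatures * images_per_ligature(num_transformations)


training_lower_bound = min_synthetic_images(UNIQUE_LIGATURES, NUM_TRANSFORMATIONS)
print(training_lower_bound)
```

Under this assumption the lower bound is in line with the roughly 38K training images reported. The abstract does not say how the 7K "parametrically different" test ligatures were produced, so they are left out of the calculation.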
Year: 2019
DOI: 10.1109/ACCESS.2019.2950537
Venue: IEEE ACCESS
Keywords: BLSTM, classification, deep neural network, Nastalique, optical character recognition (OCR), synthetic Urdu text
DocType: Journal
Volume: 7
ISSN: 2169-3536
Citations: 1
PageRank: 0.37
References: 0
Authors: 2
Name                     Order   Citations   PageRank
Syed Yasser Arafat       1       1           0.37
Muhammad Javed Iqbal     2       5           2.16