Title
Combining deep learning and language modeling for segmentation-free OCR from raw pixels
Abstract
We present a simple yet effective LSTM-based approach for recognizing machine-print text from raw pixels. We use a fully-connected feed-forward neural network for feature extraction over a sliding window, the output of which is directly fed into a stacked bi-directional LSTM. We train the network using the CTC objective function and use a WFST language model during recognition. Experimental results show that this simple system outperforms extensively tuned state-of-the-art HMM models on the DARPA Arabic Machine Print corpus.
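As a rough illustration of two steps the abstract names, the sketch below cuts a text-line image into raw-pixel sliding-window frames and collapses a per-frame label path in the way CTC outputs are read. It is not the authors' implementation. The function names, the window width and stride, and the blank index are assumptions made for illustration. Greedy collapsing stands in for the WFST language-model decoding used in the paper, and the feed-forward and LSTM layers are omitted.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank label (hypothetical choice)


def sliding_windows(image: Sequence[Sequence[float]], width: int, stride: int) -> List[List[float]]:
    """Cut a line image (rows x columns of pixel values) into flattened
    windows of `width` columns, moving `stride` columns at a time."""
    n_cols = len(image[0])
    frames = []
    for start in range(0, n_cols - width + 1, stride):
        # Flatten the window row by row into one raw-pixel feature vector.
        frames.append([row[c] for row in image for c in range(start, start + width)])
    return frames


def ctc_greedy_collapse(frame_labels: Sequence[int], blank: int = BLANK) -> List[int]:
    """Collapse a per-frame label path: merge adjacent repeats, then drop blanks."""
    out = []
    prev = None
    for label in frame_labels:
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return out


# Toy 2-row, 6-column "image"; a real line image would be far larger.
image = [[0, 1, 2, 3, 4, 5],
         [10, 11, 12, 13, 14, 15]]
frames = sliding_windows(image, width=3, stride=1)
decoded = ctc_greedy_collapse([0, 1, 1, 0, 1, 2, 2, 0])
```

In a full system each frame would pass through the feature-extraction network and the stacked bi-directional LSTM before any label path exists. The blank label is what lets a repeated character survive the collapse, as in the toy path above.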
Year
2017
DOI
10.1109/ASAR.2017.8067772
Venue
2017 1st International Workshop on Arabic Script Analysis and Recognition (ASAR)
Keywords
DARPA Arabic Machine Print corpus, deep learning, language modeling, segmentation-free OCR, raw pixels, machine-print text, feed-forward neural network, feature extraction, sliding window, stacked bi-directional LSTM, CTC objective function, WFST language model, simple system, HMM models, LSTM-based approach
Field
Sliding window protocol, Pattern recognition, Segmentation, Computer science, Feature extraction, Image segmentation, Speech recognition, Artificial intelligence, Deep learning, Hidden Markov model, Artificial neural network, Language model
DocType
Conference
ISBN
978-1-5090-6629-2
Citations
2
PageRank
0.38
References
0
Authors
4
Name, Order, Citations, PageRank
Stephen Rawls, 1, 59, 4.08
Huaigu Cao, 2, 347, 29.09
Ekraam Sabir, 3, 15, 2.42
Premkumar Natarajan, 4, 874, 79.46