Abstract |
---|
Recognizing text in natural images is a challenging and active research topic in computer vision that remains unsolved. Recent methods treat this task as a sequence labeling problem. In this task, there is a strong correspondence between positions in the sequence of input image patches and positions in the output character sequence. However, most recent recognition systems rarely consider this local information from the input sequence when recognizing the current character. In contrast, we present a Local Restricted Attention (LRA) mechanism that encodes the current vector by considering adjacent vectors of the input sequence. We propose an ensemble decoder block that combines the LRA mechanism with a regular decoder mechanism. This block not only significantly improves recognition results with shorter training time but can also be easily embedded in other recognition frameworks. In addition, we propose a scene text recognition network based on the ensemble decoder. Experimental results show that the proposed model achieves state-of-the-art performance on several benchmark datasets, including IIIT-5K, SVT, CUTE80, SVT-Perspective and the ICDAR datasets. |

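
The idea of encoding the current vector from its adjacent vectors can be illustrated with a minimal sketch. This is not the paper's formulation: the abstract gives no equations, so the scaled dot-product scoring, the softmax over a fixed window, and the names `local_restricted_attention` and `window` are all assumptions made for illustration.

```python
import math


def local_restricted_attention(vectors, window=1):
    """Re-encode each vector from neighbours at most `window` steps away.

    Assumption: scaled dot-product scores with a softmax over the local
    window. The abstract does not specify the scoring function.
    """
    n = len(vectors)
    dim = len(vectors[0])
    encoded, all_weights = [], []
    for t in range(n):
        # Positions outside [lo, hi) are never attended to.
        lo, hi = max(0, t - window), min(n, t + window + 1)
        scores = [
            sum(a * b for a, b in zip(vectors[t], vectors[j])) / math.sqrt(dim)
            for j in range(lo, hi)
        ]
        # Softmax over the window, shifted by the max for numerical stability.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        local = [e / total for e in exps]
        # Full-length weight row: zero everywhere outside the window.
        row = [0.0] * n
        row[lo:hi] = local
        all_weights.append(row)
        # Weighted sum of the neighbouring vectors.
        encoded.append([
            sum(w * vectors[j][k] for w, j in zip(local, range(lo, hi)))
            for k in range(dim)
        ])
    return encoded, all_weights


# Hypothetical patch features: 4 positions, 2 dimensions each.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
encoded, weights = local_restricted_attention(features, window=1)
```

With `window=1`, each position attends only to itself and its immediate neighbours, so the weight rows are zero elsewhere. A regular attention decoder, by contrast, would normalize over all positions.
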
Year | DOI | Venue |
---|---|---|
2019 | 10.1109/IJCNN.2019.8852010 | 2019 International Joint Conference on Neural Networks (IJCNN) |

Keywords | Field | DocType |
---|---|---|
Deep Neural Networks, Scene Text Recognition, Attention Mechanism | ENCODE, Sequence labeling, Pattern recognition, Computer science, Artificial intelligence, Deep neural networks, Text recognition | Conference |

ISSN | ISBN | Citations |
---|---|---|
2161-4393 | 978-1-7281-1986-1 | 0 |

PageRank | References | Authors |
---|---|---|
0.34 | 11 | 5 |

Name | Order | Citations | PageRank |
---|---|---|---|
Hongchao Gao | 1 | 0 | 2.70 |
Yujia Li | 2 | 0 | 0.34 |
Xi Wang | 3 | 0 | 1.69 |
Jizhong Han | 4 | 355 | 54.72 |
Ruixuan Li | 5 | 405 | 69.47 |