Title
Multi-granularity Deep Local Representations for Irregular Scene Text Recognition
Abstract
Recognizing irregular text in natural scene images is challenging because text appears in unconstrained forms, with varying curvature, orientation, and distortion. Recent recognition networks treat this task as a text sequence labeling problem, and most capture the sequence from only a single-granularity visual representation, which to some extent limits recognition performance. In this article, we propose a hierarchical attention network that captures multi-granularity deep local representations for recognizing irregular scene text. It consists of several hierarchical attention blocks, each containing a Local Visual Representation Module (LVRM) and a Decoder Module (DM). Based on the hierarchical attention network, we propose a scene text recognition network. Extensive experiments show that the proposed network achieves state-of-the-art performance on several benchmark datasets, including IIIT-5K, SVT, CUTE, SVT-Perspective, and ICDAR, with shorter training time.
Year: 2021
DOI: 10.1145/3446971
Venue: ACM/IMS Transactions on Data Science
DocType: Journal
Volume: 2
Issue: 2
ISSN: 2691-1922
Citations: 0
PageRank: 0.34
References: 0
Authors: 6
Name            Order   Citations   PageRank
Hongchao Gao    1       0           2.70
Yujia Li        2       0           0.34
Jiao Dai        3       0           0.34
Xi Wang         4       0           1.69
Jizhong Han     5       0           0.34
Ruixuan Li      6       405         69.47