Title
CATNet: Scene Text Recognition Guided by Concatenating Augmented Text Features
Abstract
We propose an end-to-end trainable text recognition model that consists of an auxiliary augmentation module and a text recognizer. Well-established traditional computer vision (CV) methods, such as binarization, sharpening, and morphological operations, play a critical role in image preprocessing and have proven particularly useful for downstream CV tasks. However, achieving good results with them often requires case-by-case hyperparameter adjustment. Inspired by these methods, we propose an auxiliary network that mimics traditional CV operations. It acts as an image preprocessing module that extracts rich augmented features from the input image, making downstream recognition easier. We study three types of augmentation modules whose parameters are learned directly via gradient back-propagation. In this way, our method combines traditional CV techniques with deep neural networks through joint learning. Extensive tests on major benchmark datasets show that the proposed method boosts the performance of recognizers, especially on degraded text images under various challenging conditions.
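The idea of a traditional CV operation whose parameter is learned by gradient descent can be illustrated with a small sketch. This is not the architecture from the paper, which the abstract does not specify. It assumes a hypothetical "soft binarization" (a sigmoid around a threshold) as a differentiable stand-in for hard thresholding, and every name in it (soft_binarize, fit_threshold, sharpness) is invented for illustration.

```python
import math

def soft_binarize(pixels, threshold, sharpness=10.0):
    """Differentiable stand-in for hard thresholding: sigmoid(k * (x - t)).

    Hypothetical illustration only; not the augmentation module from the paper.
    """
    return [1.0 / (1.0 + math.exp(-sharpness * (p - threshold))) for p in pixels]

def loss_and_grad(pixels, targets, threshold, sharpness=10.0):
    """Mean squared error against a target mask, and its derivative w.r.t. the threshold."""
    outputs = soft_binarize(pixels, threshold, sharpness)
    n = len(pixels)
    loss = sum((o - t) ** 2 for o, t in zip(outputs, targets)) / n
    # With s = sigmoid(k * (x - t)), ds/dt = -k * s * (1 - s).
    grad = sum(
        2.0 * (o - t) * (-sharpness * o * (1.0 - o))
        for o, t in zip(outputs, targets)
    ) / n
    return loss, grad

def fit_threshold(pixels, targets, threshold=0.9, lr=0.05, steps=500):
    """Plain gradient descent on the threshold, starting from a poor initial guess."""
    for _ in range(steps):
        _, grad = loss_and_grad(pixels, targets, threshold)
        threshold -= lr * grad
    return threshold

# Toy data (made up): dark pixels should map to 0, bright pixels to 1.
pixels = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
targets = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
learned = fit_threshold(pixels, targets)
print(f"learned threshold: {learned:.3f}")
```

In a joint-learning setup the error signal would come from the recognizer's loss and not from a fixed target mask, so the preprocessing parameter would be tuned for recognition accuracy instead of by hand.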
Year
2021
DOI
10.1007/978-3-030-86549-8_23
Venue
DOCUMENT ANALYSIS AND RECOGNITION - ICDAR 2021, PT I
Keywords
Scene text recognition, Feature learning, Hybrid techniques
DocType
Conference
Volume
12821
ISSN
0302-9743
Citations
0
PageRank
0.34
References
0
Authors
5
Name          Order  Citations  PageRank
Ziyin Zhang   1      0          0.34
Lemeng Pan    2      0          0.34
Lin Du        3      0          1.35
Qingrui Li    4      0          0.34
Ning Lu       5      0          0.34