SAM: Self Attention Mechanism for Scene Text Recognition Based on Swin Transformer - Citegraph

Paper Info

Title
SAM: Self Attention Mechanism for Scene Text Recognition Based on Swin Transformer

Abstract
Scene text recognition, which detects and recognizes the text in an image, has attracted extensive research interest. Attention-based methods for scene text recognition have achieved competitive performance. In these methods, the attention mechanism is usually combined with RNN structures as a module that predicts the results. However, RNN attention-based methods are sometimes hard to converge because of vanishing or exploding gradients during training, and RNNs cannot be computed in parallel. To remedy this issue, we propose a Swin Transformer-based encoder-decoder mechanism, which relies entirely on the self attention mechanism (SAM) and can be computed in parallel. SAM is an efficient text recognizer formed by only two components: 1) an encoder based on Swin Transformer that extracts the visual information of the input image, and 2) a Transformer-based decoder that produces the final results by applying self attention to the output of the encoder. Considering that the scale of scene text varies widely across images, we apply the Swin Transformer to compute the visual features with shifted windows, which limits self attention computation to non-overlapping local windows while still permitting cross-window connections. Our method improves accuracy over previous methods on ICDAR2003, ICDAR2013, SVT, SVT-P, CUTE and ICDAR2015 by 0.9%, 3.2%, 0.8%, 1.3%, 1.7% and 1.1%, respectively. Notably, our method achieves the fastest prediction time, 0.02s per image.
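
The shifted-window idea mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`window_partition`, `shifted_windows`, `window_self_attention`) are hypothetical, the attention has no learned projections, and the attention mask that Swin Transformer applies to wrapped-around tokens after the cyclic shift is omitted. It only shows how a feature map is split into non-overlapping windows, and how shifting the map by half a window before partitioning lets tokens from neighbouring windows attend to each other.

```python
import numpy as np


def window_partition(x, window_size):
    """Split an (H, W, C) feature map into non-overlapping square windows.

    Returns an array of shape (num_windows, window_size * window_size, C).
    Assumes H and W are both divisible by window_size.
    """
    H, W, C = x.shape
    x = x.reshape(H // window_size, window_size, W // window_size, window_size, C)
    x = x.transpose(0, 2, 1, 3, 4)  # group the two window-grid axes first
    return x.reshape(-1, window_size * window_size, C)


def shifted_windows(x, window_size):
    """Cyclically shift the map by half a window, then partition it.

    Each resulting window straddles up to four of the regular windows,
    which is what provides the cross-window connections.
    """
    shift = window_size // 2
    shifted = np.roll(x, shift=(-shift, -shift), axis=(0, 1))
    return window_partition(shifted, window_size)


def window_self_attention(windows):
    """Scaled dot-product self-attention computed independently per window.

    Illustrative only: queries, keys and values are the raw tokens,
    with no learned weights and a single head.
    """
    d = windows.shape[-1]
    scores = windows @ windows.transpose(0, 2, 1) / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ windows


# Toy 4 x 8 feature map whose single channel stores a token id.
ids = np.arange(32, dtype=float).reshape(4, 8, 1)
regular = window_partition(ids, window_size=2)
shifted = shifted_windows(ids, window_size=2)
attended = window_self_attention(regular)
```

In this toy map the first regular window holds tokens 0, 1, 8 and 9, while the first shifted window holds tokens 9, 10, 17 and 18, each of which belongs to a different regular window. Because every window is processed independently, the per-window attention is a single batched matrix product, which is the parallelism the abstract contrasts with RNN-based decoding.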

Year	2022
DOI	10.1007/978-3-030-98358-1_35
Venue	MULTIMEDIA MODELING (MMM 2022), PT I
Keywords	Scene text recognition, Swin transformer, Attention
DocType	Conference
Volume	13141
ISSN	0302-9743
Citations	0
PageRank	0.34
References	0
Authors	5

Authors (5 rows)

Name	Order	Citations	PageRank
Xiang Shuai	1	0	0.34
Xiao Wang	2	2	6.24
Wei Wang	3	0	0.34
Xin Yuan	4	1089	92.27
Xin Xu	5	1365	100.22
