Abstract | ||
---|---|---|
A precise, controllable, interpretable and easily trainable text removal approach is necessary for both user-specific and large-scale text removal applications. To achieve this, we propose a one-stage mask-based text inpainting network, MTRNet++. It has a novel architecture that includes mask-refine, coarse-inpainting and fine-inpainting branches, and attention blocks. With this architecture, MTRNet++ can remove text either with or without an external mask. It achieves state-of-the-art results on both the Oxford and SCUT datasets without using external ground-truth masks. Ablation studies demonstrate that the proposed multi-branch architecture with attention blocks is effective and essential. MTRNet++ also demonstrates controllability and interpretability. |
Year | DOI | Venue |
---|---|---|
2020 | 10.1016/j.cviu.2020.103066 | Computer Vision and Image Understanding |

Keywords | DocType | Volume |
---|---|---|
41A05, 41A10, 65D05, 65D17 | Journal | 201 |

Issue | ISSN | Citations |
---|---|---|
1 | 1077-3142 | 3 |

PageRank | References | Authors |
---|---|---|
0.40 | 0 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Osman Tursun | 1 | 3 | 0.40 |
Simon Denman | 2 | 509 | 56.72 |
Rui Zeng | 3 | 21 | 4.18 |
Sabesan Sivapalan | 4 | 54 | 3.36 |
Sridha Sridharan | 5 | 2092 | 222.69 |
Clinton Fookes | 6 | 743 | 97.41 |