Title
Multiscale Feature Learning by Transformer for Building Extraction From Satellite Images
Abstract
Extracting buildings from very high-resolution satellite images is a challenging yet important task for applications such as urban monitoring. Multiscale feature learning proves to be a potential solution toward accurate extraction of buildings. This study exploits a powerful multiscale feature learning module, a hierarchical vision transformer by shifted windows (swin), as a backbone within a building extraction network. To this end, we first designed a general structure for building extraction, consisting of a backbone to extract multiscale features and a head network to fuse and refine features. Then, we integrated swin into the structure as a backbone and utilized channel-wise and spatial-wise enhancement in a head network. Experimental results show that our method achieves improvements regarding both F1-score and intersection over union (IoU) compared to the multiple attending path neural network (MAP-Net), which is the current state-of-the-art (SOTA) algorithm for building extraction from remote sensing images. Our study thus confirms the potential of swin transformers as backbones for semantic segmentation tasks based on satellite images.
Year
2022
DOI
10.1109/LGRS.2022.3142279
Venue
IEEE GEOSCIENCE AND REMOTE SENSING LETTERS
Keywords
Feature extraction, Buildings, Transformers, Satellites, Semantics, Windows, Training, Attention, building extraction, satellite remote sensing, semantic segmentation, transformer
DocType
Journal
Volume
19
ISSN
1545-598X
Citations
0
PageRank
0.34
References
0
Authors
6
Name | Order | Citations | PageRank
Xi Chen | 1 | 333 | 70.76
Chunping Qiu | 2 | 0 | 0.68
Wenyue Guo | 3 | 0 | 1.01
Anzhu Yu | 4 | 0 | 0.68
Xiaochong Tong | 5 | 0 | 0.34
Michael Schmitt | 6 | 0 | 0.34