Homo-Heterogenous Transformer Learning Framework for RS Scene Classification - Citegraph

Paper Info

Title
Homo-Heterogenous Transformer Learning Framework for RS Scene Classification

Abstract
Remote sensing (RS) scene classification plays an essential role in the RS community and has attracted increasing attention due to its wide applications. Recently, benefiting from the powerful feature learning capabilities of convolutional neural networks (CNNs), the accuracy of RS scene classification has improved significantly. Although the existing CNN-based methods achieve excellent results, there is still room for improvement. First, CNN-based methods are adept at capturing the global information from RS scenes, but the context relationships hidden in RS scenes cannot be thoroughly mined. Second, owing to their specific structure, it is easy for normal CNNs to exploit the heterogenous information from RS scenes. Nevertheless, the homogenous information, which is also crucial for comprehensively understanding the complex contents of RS scenes, does not get the attention it deserves. Third, most CNNs focus on establishing the relationships between RS scenes and semantic labels. However, the similarities between RS scenes, which help distinguish intra-/interclass samples, are not considered deeply. To overcome these limitations, we propose a homo-heterogenous transformer learning (HHTL) framework for RS scene classification in this article. First, a patch generation module is designed to generate homogenous and heterogenous patches. Then, a dual-branch feature learning module (FLM) is proposed to mine homogenous and heterogenous information within RS scenes simultaneously. In the FLM, which is based on the vision transformer, not only the global information but also the local areas and their context information can be captured. Finally, we design a classification module, which consists of a fusion submodule and a metric-learning module. It can integrate homo-heterogenous information and compact/separate samples from the same/different RS scene categories. Extensive experiments are conducted on four public RS scene datasets. The encouraging results demonstrate that our HHTL framework can outperform many state-of-the-art methods. Our source code is available online.
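
The classification module described in the abstract (fusing the two branches, then pulling same-class samples together and pushing different-class samples apart) can be illustrated with a minimal sketch. The abstract does not state the fusion rule or the loss, so everything here is an assumption made for illustration: the concatenation in `fuse`, the margin-based pairwise loss in `contrastive_loss`, and the toy feature values are hypothetical and are not taken from the HHTL source code.

```python
import math


def fuse(homo_feat, hetero_feat):
    """Fuse the two branch features by concatenation (assumed fusion rule)."""
    return list(homo_feat) + list(hetero_feat)


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(feat_a, feat_b, same_class, margin=1.0):
    """Pairwise contrastive loss (assumed stand-in for the metric-learning module).

    Same-class pairs are penalised by their squared distance (compacting them);
    different-class pairs are penalised only when closer than `margin`
    (separating them).
    """
    d = euclidean(feat_a, feat_b)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Toy features: (homogenous branch, heterogenous branch) per scene.
scene_1 = fuse([0.1, 0.2], [0.3, 0.4])
scene_2 = fuse([0.1, 0.2], [0.3, 0.4])
scene_3 = fuse([0.9, 0.8], [0.7, 0.6])

print(contrastive_loss(scene_1, scene_2, same_class=True))
print(contrastive_loss(scene_1, scene_3, same_class=False))
```

In this sketch a different-class pair contributes no loss once its distance exceeds the margin, which is one common way to express "separate samples from different categories"; the paper's actual formulation may differ.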

Field	Value
Year	2022
DOI	10.1109/JSTARS.2022.3155665
Venue	IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING
Keywords	Feature extraction, Transformers, Remote sensing, Semantics, Task analysis, Representation learning, Measurement, Homo-heterogenous transformer, metric learning, remote sensing (RS) scene classification
DocType	Journal
Volume	15
ISSN	1939-1404
Citations	0
PageRank	0.34
References	0
Authors	6

Authors (6 rows)
Name	Order	Citations	PageRank
Jingjing Ma	1	0	1.01
Mingteng Li	2	0	0.68
Tang, X.	3	47	8.15
Xiangrong Zhang	4	493	48.70
Fang Liu	5	1188	125.46
Licheng Jiao	6	5698	475.84
