Title
A Deep-Local-Global Feature Fusion Framework for High Spatial Resolution Imagery Scene Classification
Abstract
High spatial resolution (HSR) imagery scene classification has recently attracted increased attention. The bag-of-visual-words (BoVW) model is an effective method for scene classification, but it can only extract handcrafted features and it disregards spatial layout information. In contrast, deep learning can automatically mine intrinsic features and preserve spatial location information, but it may lose the characteristic information of HSR images. Although previous methods combining BoVW and deep learning have achieved comparatively high classification accuracies, they did not explore the combination of handcrafted and deep features, and merely used the BoVW model as a feature coding method to encode the deep features. This means that the intrinsic characteristics of these models were not combined in previous works. In this paper, to discover more discriminative semantics for HSR imagery, the deep-local-global feature fusion (DLGFF) framework is proposed for HSR imagery scene classification. Differing from conventional scene classification methods, which utilize only handcrafted features or deep features, DLGFF establishes a framework integrating multi-level semantics from the global texture feature-based method, the BoVW model, and a pre-trained convolutional neural network (CNN). In DLGFF, two different approaches are proposed, i.e., the local and global features fused with the pooling-stretched convolutional features (LGCF) and the local and global features fused with the fully connected features (LGFF), to exploit the multi-level semantics for complex scenes. The experimental results obtained with three HSR image classification datasets confirm the effectiveness of the proposed DLGFF framework.
Compared with the published results of previous scene classification methods, the classification accuracies of the DLGFF framework on the 21-class UC Merced dataset and the 12-class Google dataset of SIRI-WHU reach 99.76%, which is superior to the current state-of-the-art methods. The classification accuracy of the DLGFF framework on the 45-class NWPU-RESISC45 dataset, 96.37 ± 0.05%, is an improvement of about 6% over the current state-of-the-art methods. This indicates that the fusion of the global low-level feature, the local mid-level feature, and the deep high-level feature can provide a representative description for HSR imagery.
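The core fusion idea described in the abstract, concatenating a global low-level feature, a local mid-level BoVW histogram, and a deep high-level CNN feature into one descriptor, can be sketched as below. This is an illustrative sketch only: the feature dimensions (64-D texture, 500-word BoVW vocabulary, 4096-D fully connected layer) and the per-block L2 normalization are assumptions for the example, not the paper's exact configuration.

```python
import numpy as np

def fuse_features(global_feat, bovw_hist, deep_feat):
    """Concatenate global low-level, local mid-level (BoVW), and deep
    high-level features into a single descriptor.

    Each input vector is L2-normalized before concatenation so that no
    single feature type dominates the fused representation by scale.
    (Sketch only; the paper's fusion details may differ.)
    """
    parts = []
    for f in (global_feat, bovw_hist, deep_feat):
        f = np.asarray(f, dtype=np.float64).ravel()
        norm = np.linalg.norm(f)
        parts.append(f / norm if norm > 0 else f)
    return np.concatenate(parts)

# Hypothetical dimensions: 64-D texture feature, 500-bin BoVW histogram,
# 4096-D CNN fully connected feature.
fused = fuse_features(np.random.rand(64), np.random.rand(500), np.random.rand(4096))
print(fused.shape)  # (4660,)
```

Per-block normalization before concatenation is a common way to balance heterogeneously scaled descriptors before feeding them to a linear classifier.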
Year: 2018
DOI: 10.3390/rs10040568
Venue: REMOTE SENSING
Keywords: scene classification, deep feature, global low-level features, local feature, BoVW, high spatial resolution image, fusion
Field: Computer vision, ENCODE, Feature fusion, Pattern recognition, Convolutional neural network, Artificial intelligence, Deep learning, Geology, Contextual image classification, Image resolution, Discriminative model, Semantics
DocType: Journal
Volume: 10
Issue: 4
Citations: 1
PageRank: 0.34
References: 20
Authors: 5
Authors (Name, Order, Citations, PageRank):
Qiqi Zhu, 1, 29, 3.55
Yanfei Zhong, 2, 1044, 90.58
Yanfei Liu, 3, 14, 1.87
Liangpei Zhang, 4, 5448, 307.02
Deren Li, 5, 620, 74.26