Title
Leaf Area Index Estimation Algorithm for GF-5 Hyperspectral Data Based on Different Feature Selection and Machine Learning Methods.
Abstract
Leaf area index (LAI) is an essential vegetation parameter that represents the light energy utilization and vegetation canopy structure. As the only in-operation hyperspectral satellite launched by China, GF-5 is potentially useful for accurate LAI estimation. However, there is no research focus on evaluating GF-5 data for LAI estimation. Hyperspectral remote sensing data contains abundant information about the reflective characteristics of vegetation canopies, but these abound data also easily result in a dimensionality curse. Therefore, feature selection (FS) is necessary to reduce data redundancy to achieve more reliable estimations. Currently, machine learning (ML) algorithms have been widely used for FS. Moreover, the same ML algorithm is usually conducted for both FS and regression in LAI estimation. However, no evidence suggests that this is the optimal solution. Therefore, this study focuses on evaluating the capacity of GF-5 spectral reflectance for estimating LAI and the performances of different combination of FS and ML algorithms. Firstly, the PROSAIL model, which coupled leaf optical properties model PROSPECT and the scattering by arbitrarily inclined leaves (SAIL) model, was used to generate simulated GF-5 reflectance data under different vegetation and soil conditions, and then three FS methods, including random forest (RF), K-means clustering (K-means) and mean impact value (MIV), and three ML algorithms, including random forest regression (RFR), back propagation neural network (BPNN) and K-nearest neighbor (KNN) were used to develop nine LAI estimation models. The FS process was conducted twice using different strategies: Firstly, three FS methods were conducted to search the lowest dimension number, which maintained the estimation accuracy of all bands. Then, the sequential backward selection (SBS) method was used to eliminate the bands having minimal impact on LAI estimation accuracy. Finally, three best estimation models were selected and evaluated using reference LAI. The results showed that although the RF_RFR model (RF used for feature selection and RFR used for regression) achieved reliable LAI estimates (coefficient of determination (R-2) = 0.828, root mean square error (RMSE) = 0.839), the poor performance (R-2= 0.763, RMSE = 0.987) of the MIV_BPNN model (MIV used for feature selection and BPNN used for regression) suggested using feature selection and regression conducted by the same ML algorithm could not always ensure an optimal estimation. Moreover, RF selection preserved the most informative bands for LAI estimation so that each ML regression method could achieve satisfactory estimation results. Finally, the results indicated that the RF_KNN model (RF used as feature selection and KNN used for regression) with seven GF-5 spectral band reflectance achieved the better estimation results than others when validated by simulated data (R-2= 0.834, RMSE = 0.824) and actual reference LAI (R-2= 0.659, RMSE = 0.697).
Year
DOI
Venue
2020
10.3390/rs12132110
REMOTE SENSING
Keywords
DocType
Volume
GF-5,LAI,feature selection,machine learning
Journal
12
Issue
Citations 
PageRank 
13
0
0.34
References 
Authors
0
11
Name
Order
Citations
PageRank
Zhulin Chen100.68
kun jia23011.31
Chenchao Xiao300.34
Dandan Wei400.34
Xiang Zhao5419.19
Jinhui Lan6216.55
xiangqin wei7407.02
Yunjun Yao810530.36
Bing Wang901.01
Yuan Sun1000.34
Lei Wang1100.68