Title
A Variable Selection Method of the Significance Multivariate Correlation Competitive Population Analysis for Near-Infrared Spectroscopy in Chemical Modeling
Abstract
The high dimensionality of spectral datasets makes it difficult to select the optimal subset of variables. This paper presents a new variable selection method called significant multivariate competitive population analysis (SMCPA), which combines ideas from significant multivariate correlation (SMC) and model population analysis, and employs weighted bootstrap sampling (WBS) and exponential decline function (EDF) competition methods. In this study, the values of the SMC distributions are used as an index for evaluating the importance of each wavelength. Then, based on the importance level of each wavelength, SMCPA sequentially selects N subsets of spectral wavelengths by N Monte Carlo sampling runs in an iterative and competitive procedure. In each sampling run, a fixed ratio of samples is used to build a partial least-squares (PLS) calibration model, and SMC is then performed to obtain the score and threshold values. Next, based on the SMC scores, the key variables are selected in two steps: compulsory selection by the exponential decline function and competitive selection by adaptive weighted sampling. Finally, cross-validation (CV) is applied to select the subset with the lowest root mean square error. The method is tested on three NIR spectral datasets and compared against three high-performance variable selection methods. The experimental results show that the proposed algorithm has the highest efficiency and the best selection effect of the methods compared, and can usually locate the optimal combination of key wavelength variables in a dataset. PLS models built on the selected variables also give the best evaluation results.
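The iterative procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the EDF uses the CARS-style form r_i = a * exp(-k * i), the SMC score is replaced by a hypothetical placeholder (absolute ridge-regression coefficients), and a ridge fit stands in for the PLS model. All function names and parameters (`edf_ratio`, `placeholder_scores`, `cv_rmse`, `competitive_selection`, `ridge`, `sample_ratio`) are illustrative only.

```python
import numpy as np

def ridge_coef(Xc, yc, ridge=1e-3):
    # Ridge fit on centered data; a stand-in for the PLS model (assumption).
    return np.linalg.solve(Xc.T @ Xc + ridge * np.eye(Xc.shape[1]), Xc.T @ yc)

def placeholder_scores(X, y):
    # Hypothetical importance score; the paper uses SMC scores instead.
    return np.abs(ridge_coef(X - X.mean(axis=0), y - y.mean()))

def edf_ratio(i, n_runs, n_vars):
    # Assumed CARS-style exponential decline: 1.0 at run 1, 2/n_vars at the last run.
    a = (n_vars / 2) ** (1 / (n_runs - 1))
    k = np.log(n_vars / 2) / (n_runs - 1)
    return a * np.exp(-k * i)

def cv_rmse(X, y, n_folds=5):
    # Interleaved k-fold cross-validated root mean square error.
    idx = np.arange(len(y))
    sq_errs = []
    for f in range(n_folds):
        test = idx[f::n_folds]
        train = np.setdiff1d(idx, test)
        mx, my = X[train].mean(axis=0), y[train].mean()
        coef = ridge_coef(X[train] - mx, y[train] - my)
        sq_errs.append(((X[test] - mx) @ coef + my - y[test]) ** 2)
    return float(np.sqrt(np.mean(np.concatenate(sq_errs))))

def competitive_selection(X, y, n_runs=20, sample_ratio=0.8, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    active = np.arange(p)
    best_subset, best_rmse = active, np.inf
    for i in range(1, n_runs + 1):
        # Monte Carlo sampling: a fixed ratio of the samples per run.
        rows = rng.choice(n, size=int(sample_ratio * n), replace=False)
        scores = placeholder_scores(X[rows][:, active], y[rows])
        # Step 1, compulsory selection: keep the top-scoring EDF fraction.
        n_keep = min(active.size, max(2, int(round(edf_ratio(i, n_runs, p) * p))))
        order = np.argsort(scores)[::-1][:n_keep]
        kept, w = active[order], scores[order] + 1e-12
        # Step 2, competitive selection: weighted sampling with replacement.
        drawn = rng.choice(n_keep, size=n_keep, replace=True, p=w / w.sum())
        active = kept[np.unique(drawn)]
        # Track the subset with the lowest cross-validated RMSE.
        rmse = cv_rmse(X[:, active], y)
        if rmse < best_rmse:
            best_subset, best_rmse = active.copy(), rmse
    return np.sort(best_subset), best_rmse
```

Calling `competitive_selection(X, y)` on a samples-by-wavelengths matrix `X` and a response vector `y` returns the indices of the retained wavelengths and their cross-validated RMSE. Replacing the placeholder score with an actual SMC computation on a PLS model would bring the sketch closer to the method the abstract describes.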
Year: 2019
DOI: 10.1109/ACCESS.2019.2954115
Venue: IEEE ACCESS
Keywords: Spectrochemical analysis, variable selection, significant multivariate correlation, weighted bootstrap sampling, model population analysis, Monte Carlo sampling, analytical techniques, partial least squares method
DocType: Journal
Volume: 7
ISSN: 2169-3536
Citations: 0
PageRank: 0.34
References: 0
Authors: 3
Name            Order   Citations   PageRank
Yuxi Wang       1       0           0.34
Zhenhong Jia    2       29          15.13
Jie Yang        3       282         57.59