Title
A new two-stage hybrid feature selection algorithm and its application in Chinese medicine
Abstract
High-dimensional, small-sample data are prone to the curse of dimensionality and to overfitting, and they contain many irrelevant and redundant features. To address these feature selection problems, a new Two-stage Hybrid Feature Selection Algorithm (Ts-HFSA) is proposed. The first stage combines the Filter method with the Wrapper method to adaptively remove irrelevant features. The second stage uses a De-redundancy Algorithm of Fusing Approximate Markov Blanket with L1 Regular Term (DA2MBL1) to address two weaknesses of the Approximate Markov Blanket (AMB): information loss when redundant features are deleted, and residual redundancy in the feature subset that AMB returns. Experiments on multiple UCI public datasets and on datasets from the material foundation of Chinese medicine showed that Ts-HFSA removed irrelevant and redundant features more effectively, found smaller and higher-quality feature subsets, and improved stability, indicating advantages over AMB, FCBF, RF, GBDT, XGBoost, Lasso, and CI_AMB. Moreover, on the data from the material foundation of Chinese medicine, which have higher feature dimensions and fewer samples, Ts-HFSA performed better and improved model precision even after greatly reducing the dimension. The results indicate that Ts-HFSA is an effective feature selection method for high-dimensional small samples and a promising research method for the material foundation of Chinese medicine.
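The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' Ts-HFSA or DA2MBL1: the abstract does not give their relevance measure, their adaptive threshold, the Wrapper step, or how the L1 term is fused, so all of those are omitted or assumed here. The sketch assumes discrete features, uses symmetric uncertainty (the measure used by FCBF) for relevance, a fixed hypothetical `relevance_threshold`, and the standard approximate Markov blanket test for redundancy. The function names are illustrative only.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(x, y) = 2 * I(x; y) / (H(x) + H(y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def select_features(columns, labels, relevance_threshold=0.1):
    """Return the names of the features kept after both stages.

    columns: dict mapping feature name -> list of discrete values.
    relevance_threshold: assumed fixed cut-off (hypothetical; the paper
    describes an adaptive Filter + Wrapper step instead).
    """
    # Stage 1 (filter only): drop features weakly related to the class.
    relevance = {
        name: symmetric_uncertainty(col, labels)
        for name, col in columns.items()
    }
    candidates = sorted(
        (name for name in relevance if relevance[name] > relevance_threshold),
        key=lambda name: -relevance[name],
    )

    # Stage 2 (approximate Markov blanket): a kept feature k covers a
    # candidate f when SU(k, class) >= SU(f, class) and
    # SU(k, f) >= SU(f, class). The first condition holds automatically
    # because candidates are visited in order of decreasing relevance.
    kept = []
    for name in candidates:
        covered = any(
            symmetric_uncertainty(columns[k], columns[name]) >= relevance[name]
            for k in kept
        )
        if not covered:
            kept.append(name)
    return kept


# Toy data: "a" and "b" are informative, "a_copy" duplicates "a",
# and "noise" is independent of the class.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "a": [0, 0, 0, 1, 1, 1, 1, 1],
    "a_copy": [0, 0, 0, 1, 1, 1, 1, 1],
    "b": [0, 0, 0, 0, 0, 1, 1, 1],
    "noise": [0, 1, 0, 1, 0, 1, 0, 1],
}
print(sorted(select_features(columns, labels)))
```

On this toy data the independent column is removed in the first stage and the duplicated column in the second. A fuller sketch would add a Wrapper evaluation to tune the threshold and an L1-regularized model to prune the subset further, as the abstract describes.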
Year
2022
DOI
10.1007/s13042-021-01445-y
Venue
International Journal of Machine Learning and Cybernetics
Keywords
Feature selection, High-dimensional small sample, Approximate Markov blanket, Material foundation of Chinese medicine
DocType
Journal
Volume
13
Issue
5
ISSN
1868-8071
Citations
0
PageRank
0.34
References
16
Authors
6
Name             Order  Citations  PageRank
Zhiqin Li        1      0          0.34
Jianqiang Du     2      0          0.68
Bin Nie          3      27         5.56
Wangping Xiong   4      0          0.34
Guoliang Xu      5      9          2.08
Jigen Luo        6      0          0.68