Title
MUSE: Minimum Uncertainty and Sample Elimination Based Binary Feature Selection
Abstract
This paper presents a novel incremental feature selection method based on minimum uncertainty and feature sample elimination (referred to as MUSE). Feature selection is an important step in machine learning. Past incremental feature selection approaches have attempted to increase class relevance while simultaneously minimizing redundancy with previously selected features. One example of such an approach is the minimum Redundancy Maximum Relevance (mRMR) feature selection method. The proposed approach differs from the prior mRMR approach in how the redundancy of the current feature with previously selected features is reduced. In the proposed approach, the feature samples are divided into a pre-specified number of bins; this step is referred to as feature quantization. A novel uncertainty score for each feature is computed by summing the conditional entropies of the bins, and the feature with the lowest uncertainty score is selected. For each bin, its impurity is computed by taking the minimum of the probability of Class 1 and that of Class 2. The feature samples corresponding to the bins with impurities below a threshold are discarded and are not used for selection of the subsequent features. The significance of the MUSE feature selection method is demonstrated using two datasets, arrhythmia and handwritten digit recognition (Gisette), as well as seizure prediction datasets from five dogs and two humans. It is shown that the proposed method outperforms the prior mRMR feature selection method in most cases. For the arrhythmia dataset, the proposed method achieves 30% higher sensitivity at the expense of a 7% loss of specificity. For the Gisette dataset, the proposed method achieves 15% higher accuracy for Class 2, at the expense of 3% lower accuracy for Class 1. For seizure prediction on the five dogs and two humans, the proposed method achieves a higher area under the curve (AUC) for all subjects.
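The selection loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the equal-width quantization, the weighting of each bin's entropy by its share of samples, the default values of `n_bins` and `threshold`, recomputing bins only on the samples that remain, and every function name below are assumptions made for illustration. Labels are assumed to be 1 for Class 1 and anything else for Class 2.

```python
import math


def quantize(values, n_bins):
    """Map each value to an equal-width bin index in 0 .. n_bins-1 (assumed scheme)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant feature
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def bin_class1_prob(bins, labels, n_bins):
    """Return per-bin sample counts and the per-bin fraction of Class 1 samples."""
    counts = [0] * n_bins
    ones = [0] * n_bins
    for b, c in zip(bins, labels):
        counts[b] += 1
        ones[b] += (c == 1)
    probs = [o / n if n else 0.0 for o, n in zip(ones, counts)]
    return counts, probs


def entropy(p):
    """Binary entropy, in bits, of a bin whose Class 1 probability is p."""
    return -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0)


def uncertainty(counts, probs):
    """Sum of per-bin entropies, each weighted by the bin's share of samples (assumed)."""
    total = sum(counts)
    return sum(n / total * entropy(p) for n, p in zip(counts, probs))


def muse_select(X, y, n_select, n_bins=8, threshold=0.05):
    """Greedy selection: lowest-uncertainty feature first, then sample elimination."""
    active = list(range(len(y)))        # indices of samples still in play
    remaining = list(range(len(X[0])))  # indices of features not yet selected
    selected = []
    while remaining and active and len(selected) < n_select:
        labels = [y[i] for i in active]
        best = None
        for j in remaining:
            bins = quantize([X[i][j] for i in active], n_bins)
            counts, probs = bin_class1_prob(bins, labels, n_bins)
            score = uncertainty(counts, probs)
            if best is None or score < best[0]:
                best = (score, j, bins, probs)
        _, j, bins, probs = best
        selected.append(j)
        remaining.remove(j)
        # Bin impurity = min(P(Class 1), P(Class 2)). Samples in bins whose
        # impurity is below the threshold are discarded for later rounds.
        active = [i for i, b in zip(active, bins)
                  if min(probs[b], 1.0 - probs[b]) >= threshold]
    return selected
```

Calling `muse_select(X, y, n_select=k)` with `X` as a list of rows and `y` as a list of labels returns the indices of the chosen features in selection order. In this sketch the loop stops early once every sample has been eliminated, which is a design choice of the illustration rather than something the abstract specifies.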
Year: 2019
DOI: 10.1109/tkde.2018.2865778
Venue: IEEE Transactions on Knowledge and Data Engineering
Keywords: Feature extraction, Uncertainty, Redundancy, Quantization (signal), Impurities, Mutual information, Computational modeling
Field: Data mining, Feature selection, Bin, Pattern recognition, Computer science, Feature extraction, Redundancy (engineering), Mutual information, Artificial intelligence, Conditional entropy, Quantization (signal processing), Binary number
DocType: Journal
Volume: 31
Issue: 9
ISSN: 1041-4347
Citations: 1
PageRank: 0.35
References: 0
Authors: 2
Name, Order, Citations, PageRank
Zisheng Zhang, 1, 1, 0.35
Keshab K. Parhi, 2, 3235, 369.07