Title | ||
---|---|---|
Impact of a metric of association between two variables on performance of filters for binary data |
Abstract | ||
---|---|---|
In the feature selection community, filters are quite popular. Design of a filter depends on two parameters, namely the objective function and the metric it employs for estimating the feature-to-class (relevance) and feature-to-feature (redundancy) association. Filter designers pay relatively more attention to the objective function, but a poor metric can overshadow the goodness of an objective function. The metrics that have been proposed in the literature estimate relevance and redundancy differently, thus raising the question: can the metric estimating the association between two variables improve the feature selection capability of a given objective function or, in other words, a filter? This paper investigates this question. Mutual information is the metric proposed for measuring the relevance and redundancy between the features for the mRMR filter [1], while the MBF filter [2] employs the correlation coefficient. Symmetrical uncertainty, a variant of mutual information, is used by the fast correlation-based filter (FCBF) [3]. We carry out experiments on the mRMR, MBF and FCBF filters with three different metrics (mutual information, correlation coefficient and diff-criterion) using three binary data sets and four widely used classifiers. We find that MBF's performance is much better if it uses the diff-criterion rather than the correlation coefficient, while mRMR with the diff-criterion demonstrates performance better than or comparable to mRMR with mutual information. For the FCBF filter, the diff-criterion also exhibits results much better than mutual information. |
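
As a rough illustration of the association metrics named in the abstract, the sketch below computes mutual information, symmetrical uncertainty and the correlation coefficient for two binary variables using their standard textbook definitions. It is not the authors' implementation: the function names and the toy `feature` and `label` vectors are hypothetical, and the diff-criterion is left out because the abstract does not define it.

```python
import math
from collections import Counter


def entropy(xs):
    """Shannon entropy (in bits) of a discrete sequence."""
    n = len(xs)
    return -sum((c / n) * math.log2(c / n) for c in Counter(xs).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), in bits."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def symmetrical_uncertainty(xs, ys):
    """SU(X,Y) = 2 * I(X;Y) / (H(X) + H(Y)); returns 0 if both entropies are 0."""
    denom = entropy(xs) + entropy(ys)
    return 2.0 * mutual_information(xs, ys) / denom if denom > 0 else 0.0


def pearson(xs, ys):
    """Pearson correlation; for 0/1 variables this equals the phi coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    # Assumption: a constant variable is treated as having zero correlation.
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


# Hypothetical toy data: one binary feature and a binary class label.
feature = [1, 1, 0, 0, 1, 0, 1, 0]
label = [1, 1, 0, 0, 1, 0, 0, 1]

print("mutual information:     ", mutual_information(feature, label))
print("symmetrical uncertainty:", symmetrical_uncertainty(feature, label))
print("correlation coefficient:", pearson(feature, label))
```

Applied to a feature and the class label, each function gives a relevance score; applied to two features, it gives a redundancy score. How those scores are combined is the job of the filter's objective function, which this sketch does not cover.
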
Year | DOI | Venue |
---|---|---|
2014 | 10.1016/j.neucom.2014.05.066 | Neurocomputing |
Keywords | Field | DocType |
binary data,classification,feature selection,filters | Correlation coefficient,Pattern recognition,Feature selection,Correlation,Redundancy (engineering),Artificial intelligence,Mutual information,Binary data,Machine learning,Mathematics | Journal |
Volume | Issue | ISSN |
143 | 1 | 0925-2312 |
Citations | PageRank | References |
2 | 0.40 | 25 |
Authors | ||
---|---|---|
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Kashif Javed | 1 | 110 | 8.87 |
Haroon A. Babri | 2 | 81 | 4.63 |
Mehreen Saeed | 3 | 87 | 7.32 |