Title
The correctness problem: evaluating the ordering of binary features in rankings
Abstract
In machine learning, feature ranking (FR) algorithms are used to rank features by relevance to the class variable. FR algorithms are mostly investigated for the feature selection problem and less studied for the problem of ranking. This paper focuses on the latter. Stated in FR terminology, the ranking problem raises the following question: since different FR criteria estimate the relationship between a feature and the class variable differently on a given data set, can we determine which criterion better captures the "true" feature-to-class relationship and thus generates the most "correct" order of individual features? This is termed the "correctness" problem. It requires a reference ordering against which the ranks assigned to features by an FR algorithm are directly compared. The reference ranking is generally unknown for real-life data. In this paper, we show through theoretical and empirical analysis that for two-class classification tasks represented with binary data, the ordering of binary features based on their individual predictive powers can be used as a benchmark, allowing us to test how correct the ordering produced by an FR algorithm is. Based on these ideas, an evaluation method termed the FR evaluation strategy (FRES) is proposed. Rankings of three different FR criteria (relief, mutual information, and the diff-criterion) are investigated on five artificially generated and four real-life binary data sets. The results indicate that FRES works equally well for synthetic and real-life data and that the diff-criterion generates the most correct orderings for binary data.
Year: 2014
DOI: 10.1007/s10115-013-0631-0
Venue: Knowl. Inf. Syst.
Keywords: binary data, feature ranking evaluation, the ranking problem, variables ordering
Field: Evaluation strategy, Data mining, Feature selection, Ranking, Computer science, Correctness, Artificial intelligence, Mutual information, Binary data, Class variable, Machine learning, Binary number
DocType: Journal
Volume: 39
Issue: 3
ISSN: 0219-3116
Citations: 3
PageRank: 0.38
References: 29
Authors: 3
Name, Order, Citations, PageRank
Kashif Javed, 1, 110, 8.87
Mehreen Saeed, 2, 87, 7.32
Haroon Atique Babri, 3, 226, 6.97