Abstract |
---|
We investigate the problem of supervised feature selection within the filtering framework. In our approach, applicable to two-class problems, the strength of a feature is inversely proportional to the p-value of the null hypothesis that its class-conditional densities, p(X \| Y = 0) and p(X \| Y = 1), are identical. To estimate the p-values, we use Fisher's permutation test combined with four simple filtering criteria in the role of test statistics: sample mean difference, symmetric Kullback-Leibler distance, information gain, and the chi-square statistic. The experimental results of our study, obtained using the naive Bayes classifier and support vector machines, strongly indicate that the permutation test improves the above filters and can be used effectively when the sample size is relatively small and the number of features relatively large. |
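The record itself contains no code; as a rough illustration of the procedure the abstract describes, the sketch below scores one feature of a two-class problem with Fisher's permutation test, using the absolute sample mean difference as the test statistic (one of the four criteria named above). This is a minimal NumPy sketch under stated assumptions: the function names, the fixed seed, the add-one p-value correction, and the default of 1000 permutations are illustrative choices, not details taken from the paper.

```python
import numpy as np

def permutation_p_value(x, y, n_permutations=1000, seed=0):
    """Fisher's permutation test for a single feature (illustrative sketch).

    Estimates the p-value of the null hypothesis that the class-conditional
    densities p(X|Y=0) and p(X|Y=1) are identical, with the absolute sample
    mean difference as the test statistic.
    """
    rng = np.random.default_rng(seed)
    observed = abs(x[y == 1].mean() - x[y == 0].mean())
    exceed = 0
    for _ in range(n_permutations):
        y_perm = rng.permutation(y)  # shuffling labels simulates the null
        stat = abs(x[y_perm == 1].mean() - x[y_perm == 0].mean())
        if stat >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive (an assumption,
    # not a detail from the paper).
    return (exceed + 1) / (n_permutations + 1)

def rank_features(X, y, **kwargs):
    """Rank features by ascending p-value; feature strength is inversely
    proportional to the p-value, so the strongest features come first."""
    p_values = np.array([permutation_p_value(X[:, j], y, **kwargs)
                         for j in range(X.shape[1])])
    return np.argsort(p_values), p_values
```

Calling `rank_features(X, y)` on an (n_samples, n_features) array and a 0/1 label vector returns feature indices ordered from strongest to weakest; the other three statistics named in the abstract (symmetric Kullback-Leibler distance, information gain, chi-square) would be swapped in for the mean difference in the same permutation loop.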
Year | DOI | Venue |
---|---|---|
2004 | 10.1007/978-3-540-30115-8_32 | Lecture Notes in Computer Science |
Keywords | Field | DocType
---|---|---|
naive bayes classifier, sample size, information gain, kullback leibler distance, support vector machine, feature selection, permutation test | Chi-square test, Naive Bayes classifier, Pattern recognition, Feature selection, Statistic, Permutation, Artificial intelligence, Resampling, Sample size determination, Statistical hypothesis testing, Mathematics | Conference
Volume | ISSN | Citations
---|---|---|
3201 | 0302-9743 | 12
PageRank | References | Authors
---|---|---|
1.16 | 29 | 4
Name | Order | Citations | PageRank
---|---|---|---|
Predrag Radivojac | 1 | 646 | 58.89 |
Zoran Obradovic | 2 | 1110 | 137.41 |
A. Keith Dunker | 3 | 466 | 77.54 |
Slobodan Vucetic | 4 | 637 | 56.38 |