Title
A new feature selection approach based on ensemble methods in semi-supervised classification
Abstract
In computer-aided medical systems, many practical classification applications face rapid growth in the volume of data collected and stored. This is especially the case in areas such as the prediction of medical test efficiency, the classification of tumors, and the detection of cancers. Data with known class labels (labeled data) can be limited, whereas unlabeled data (with unknown class labels) are more readily available. Semi-supervised learning covers methods that exploit unlabeled data in addition to labeled data to improve performance on the classification task. In this paper, we consider the problem of using a large amount of unlabeled data to improve the efficiency of feature selection in high-dimensional datasets when only a small set of labeled examples is available. We propose a new semi-supervised feature evaluation method called Optimized co-Forest for Feature Selection (OFFS), which combines ideas from co-forest with the embedded selection principle of Random Forest, based on permutation of the out-of-bag set. We provide empirical results on several medical and biological benchmark datasets, indicating an overall significant improvement of OFFS over four other semi-supervised feature selection approaches that follow filter, wrapper, and embedded strategies. By exploiting unlabeled samples, our method proves able to select features and measure their importance effectively, improving the performance of the hypothesis learned from a small number of labeled samples.
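The abstract names two ingredients: confident labeling of unlabeled samples by an ensemble, and feature importance measured by permuting out-of-bag samples. The record gives no algorithmic detail, so the sketch below is only a rough illustration of those two ideas, not the authors' OFFS procedure. It assumes decision stumps in place of full trees, a single majority-vote pseudo-labeling pass in place of co-forest's iterative per-tree refinement, and made-up toy data. All function names and the `min_conf` threshold are hypothetical.

```python
import random

def fit_stump(X, y):
    """Fit a one-split stump: (feature, threshold, left_label, right_label)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            l_lab = max(set(left), key=left.count)
            r_lab = max(set(right), key=right.count) if right else l_lab
            err = sum(yi != l_lab for yi in left) + sum(yi != r_lab for yi in right)
            if best is None or err < best[0]:
                best = (err, j, t, l_lab, r_lab)
    return best[1:]

def predict(stump, row):
    j, t, l_lab, r_lab = stump
    return l_lab if row[j] <= t else r_lab

def accuracy(stump, X, y):
    return sum(predict(stump, r) == yi for r, yi in zip(X, y)) / len(y)

def bagged_stumps(X, y, n_trees, rng):
    """Return (stump, out_of_bag_indices) pairs, one per bootstrap sample."""
    n = len(X)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        in_bag = set(idx)
        oob = [i for i in range(n) if i not in in_bag]
        stump = fit_stump([X[i] for i in idx], [y[i] for i in idx])
        forest.append((stump, oob))
    return forest

def pseudo_label(forest, X_unlabeled, min_conf=0.75):
    """Keep unlabeled rows whose majority-vote share reaches min_conf (assumed value)."""
    rows, labels = [], []
    for row in X_unlabeled:
        votes = [predict(stump, row) for stump, _ in forest]
        top = max(set(votes), key=votes.count)
        if votes.count(top) / len(votes) >= min_conf:
            rows.append(row)
            labels.append(top)
    return rows, labels

def oob_permutation_importance(forest, X, y, rng):
    """Mean drop in out-of-bag accuracy when one feature's OOB values are shuffled."""
    d = len(X[0])
    totals, used = [0.0] * d, 0
    for stump, oob in forest:
        if not oob:
            continue
        X_oob = [X[i] for i in oob]
        y_oob = [y[i] for i in oob]
        base = accuracy(stump, X_oob, y_oob)
        for j in range(d):
            col = [r[j] for r in X_oob]
            rng.shuffle(col)
            X_perm = [r[:j] + [v] + r[j + 1:] for r, v in zip(X_oob, col)]
            totals[j] += base - accuracy(stump, X_perm, y_oob)
        used += 1
    return [t / max(used, 1) for t in totals]

rng = random.Random(0)
# Toy data (made up): feature 0 equals the class label, feature 1 is pure noise.
X_lab = [[i % 2, rng.random()] for i in range(10)]
y_lab = [i % 2 for i in range(10)]
X_unl = [[i % 2, rng.random()] for i in range(60)]

forest = bagged_stumps(X_lab, y_lab, n_trees=25, rng=rng)    # small labeled set only
X_new, y_new = pseudo_label(forest, X_unl)                   # confident unlabeled rows
X_all, y_all = X_lab + X_new, y_lab + y_new
forest = bagged_stumps(X_all, y_all, n_trees=25, rng=rng)    # refit on the enlarged set
importance = oob_permutation_importance(forest, X_all, y_all, rng)
print(importance)
```

On this toy data the informative feature should receive a clearly positive score while the noise feature stays at zero, since no stump ever splits on it. A real experiment would use full decision trees and a proper co-forest loop.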
Year: 2017
DOI: 10.1007/s10044-015-0524-9
Venue: Pattern Anal. Appl.
Keywords: Feature selection, Semi-supervised learning, Ensemble methods, Co-forest, Random Forest, Large datasets, Medical diagnosis
Field: Medical test, Semi-supervised learning, Feature selection, Pattern recognition, Computer science, Permutation, Multiplication, Artificial intelligence, Random forest, Small set, Ensemble learning, Machine learning
DocType: Journal
Volume: 20
Issue: 3
ISSN: 1433-755X
Citations: 0
PageRank: 0.34
References: 39
Authors: 4
Name | Order | Citations | PageRank
Nesma Settouti | 1 | 37 | 6.33
Mohammed Amine Chikh | 2 | 30 | 2.82
Vincent Barra | 3 | 125 | 14.42
Chikh Mohamed Amine | 4 | 0 | 0.34