VSURF: An R Package for Variable Selection Using Random Forests - Citegraph

Paper Info

Title
VSURF: An R Package for Variable Selection Using Random Forests

Abstract
This paper describes the R package VSURF. Based on random forests, and for both regression and classification problems, it returns two subsets of variables. The first is a subset of important variables that retains some redundancy, which can be relevant for interpretation; the second is a smaller subset corresponding to a model that tries to avoid redundancy and focuses more closely on the prediction objective. The two-stage strategy is based on a preliminary ranking of the explanatory variables using the random forests permutation-based importance score, and then proceeds with a stepwise forward strategy for variable introduction. Both subsets can be obtained automatically using data-driven default values, which are good enough to provide interesting results, but the strategy can also be tuned by the user. The algorithm is illustrated on a simulated example, and its applications to real datasets are presented.

Year	2015
DOI	10.32614/rj-2015-018
Venue	R JOURNAL
Field	Data mining,Feature selection,Ranking,Regression,Computer science,Permutation,Redundancy (engineering),Artificial intelligence,Statistics,Random forest,Machine learning,R package
DocType	Journal
Volume	7
Issue	2
ISSN	2073-4859
Citations	0
PageRank	0.34
References	0
Authors	3

Authors (3 rows)

Name	Order	Citations	PageRank
Robin Genuer	1	4	2.14
Jean-Michel Poggi	2	174	16.19
Christine Tuleau-Malot	3	87	5.23
