A finite-sample simulation study of cross validation in tree-based models - Citegraph

Paper Info

Title
A finite-sample simulation study of cross validation in tree-based models

Abstract
Cross validation (CV) is widely used for choosing and evaluating statistical models. The main purpose of this study is to explore the behavior of CV in tree-based models. We do so experimentally, comparing a cross-validated tree classifier with the Bayes classifier, which is ideal for the underlying distribution. The main observation is that the differences between the testing and training errors of a cross-validated tree classifier and of the Bayes classifier are empirically related by a linear regression. The slope and the coefficient of determination of the regression model can serve as performance measures of a cross-validated tree classifier. Moreover, simulation reveals that the performance of a cross-validated tree classifier depends on the geometry and parameters of the underlying distributions and on the sample size. Our study can explain, evaluate, and justify the use of CV in tree-based models when the sample size is relatively small.
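
The abstract names the slope and the coefficient of determination of a linear regression as performance measures. The following is a minimal sketch of how those two quantities could be computed from paired error gaps (testing error minus training error). It is not the authors' code: the function name `fit_line`, the choice of the Bayes classifier's gap as predictor and the tree's gap as response, and the numbers themselves are all assumptions made for illustration.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    syy = sum((yi - y_bar) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # For simple linear regression, R^2 equals the squared correlation.
    r_squared = (sxy ** 2) / (sxx * syy)
    return slope, intercept, r_squared


# Made-up placeholder gaps (testing error minus training error), one pair per
# simulated data set. These are NOT results from the paper.
bayes_gap = [0.00, 0.01, 0.02, 0.03, 0.04]   # assumed predictor
tree_gap = [0.01, 0.03, 0.05, 0.07, 0.09]    # assumed response: 2 * bayes_gap + 0.01

slope, intercept, r_squared = fit_line(bayes_gap, tree_gap)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r_squared:.3f}")
```

In a real experiment each pair would come from one simulated data set, and a coefficient of determination near 1 would indicate that the tree's error gap tracks the Bayes classifier's gap closely.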

Year	DOI	Venue
2009	10.1007/s10799-009-0052-7	Information Technology and Management

Keywords	Field	DocType
linear regression, cross validation, sample size, statistical model, coefficient of determination, Bayes classifier, regression model	Pattern recognition, Naive Bayes classifier, Computer science, Statistical model, Artificial intelligence, Classifier (linguistics), Bayes error rate, Cross-validation, Bayes classifier, Sample size determination, Quadratic classifier	Journal

Volume	Issue	ISSN
10	4	1385-951X

Citations	PageRank	References
1	0.35	6
Authors
3

Authors (3 rows)

Name	Order	Citations	PageRank
Seoung Bum Kim	1	205	34.54
Xiaoming Huo	2	157	24.83
K. Leung	3	487	67.33
