Abstract |
---|
We consider the problem of nonparametric regression: learning an arbitrary mapping f : X → Y from a data set of (x, y) pairs in which the y values are corrupted by noise of mean zero. This statistical task is known to be subject to a severe curse of dimensionality: if X ⊂ ℝ^D, and the only smoothness assumption on f is a Lipschitz condition, then any estimator based on n data points will have an error rate (risk) of Ω(n^(−2/(2+D))). Here we present a tree-based regressor whose risk depends only on the doubling dimension of X, not on D. This notion of dimension generalizes two cases of contemporary interest: when X is a low-dimensional manifold, and when X is sparse. The tree is built using random hyperplanes as splitting criteria, building upon recent work of Dasgupta and Freund (2008) [5]; and we show that axis-parallel splits cannot achieve the same finite-sample rate of convergence. |
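The construction described in the abstract (a tree grown with random-hyperplane splits, whose leaves average the local y values) can be sketched roughly as follows. This is a minimal illustration, not the paper's exact algorithm: the median-cut rule, the `min_leaf` stopping parameter, and all function names are assumptions made for the sketch.

```python
import numpy as np

def split_cell(X, rng):
    """Project points onto a random unit direction and cut at the median."""
    d = rng.standard_normal(X.shape[1])
    d /= np.linalg.norm(d)
    proj = X @ d
    t = np.median(proj)
    return d, t, proj <= t

def build(X, y, rng, min_leaf=5):
    """Grow a random-hyperplane tree; each leaf predicts its local mean of y."""
    if len(y) <= min_leaf:
        return ("leaf", float(y.mean()))
    d, t, left = split_cell(X, rng)
    if left.all() or not left.any():   # degenerate split: stop here
        return ("leaf", float(y.mean()))
    return ("node", d, t,
            build(X[left], y[left], rng, min_leaf),
            build(X[~left], y[~left], rng, min_leaf))

def predict(tree, x):
    """Route a query point down to its leaf and return the leaf mean."""
    while tree[0] == "node":
        _, d, t, left_child, right_child = tree
        tree = left_child if x @ d <= t else right_child
    return tree[1]
```

Because the split directions are random rather than axis-parallel, cell diameters shrink at a rate governed by the intrinsic (doubling) dimension of the data, which is the mechanism behind the risk bound claimed in the abstract.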
Year | DOI | Venue |
---|---|---|
2012 | 10.1016/j.jcss.2012.01.002 | J. Comput. Syst. Sci. |
Keywords | Field | DocType
---|---|---|
doubling dimension,contemporary interest,finite-sample rate,mean zero,lipschitz condition,error rate,tree-based regressor,arbitrary mapping,intrinsic dimension,nonparametric regression,low-dimensional manifold,n data point,manifold,sparse data | Data point,Discrete mathematics,Combinatorics,Nonparametric regression,Curse of dimensionality,Intrinsic dimension,Rate of convergence,Lipschitz continuity,Hyperplane,Mathematics,Estimator | Journal
Volume | Issue | ISSN
---|---|---|
78 | 5 | 0022-0000
Citations | PageRank | References
---|---|---|
12 | 0.78 | 12
Authors |
---|
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Samory Kpotufe | 1 | 92 | 11.56 |
Sanjoy Dasgupta | 2 | 2052 | 172.00 |