Title
---
A Non-Parametric Regression Viewpoint: Generalization of Overparametrized Deep ReLU Network Under Noisy Observations
Abstract
---
We study the generalization properties of overparameterized deep neural networks (DNNs) with Rectified Linear Unit (ReLU) activations. Under the non-parametric regression framework, it is assumed that the ground-truth function lies in a reproducing kernel Hilbert space (RKHS) induced by the neural tangent kernel (NTK) of a ReLU DNN, and that a dataset with noisy observations is given. Without a delicate adoption of early stopping, we prove that an overparametrized DNN trained by vanilla gradient descent does not recover the ground-truth function: the estimated DNN's $L_{2}$ prediction error remains bounded away from $0$. As a complement to this result, we show that $\ell_{2}$-regularized gradient descent enables the overparametrized DNN to achieve the minimax optimal convergence rate of the $L_{2}$ prediction error, without early stopping. Notably, the rate we obtain is faster than the $\mathcal{O}(n^{-1/2})$ rate known in the literature.
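The contrast the abstract draws between vanilla and $\ell_{2}$-regularized gradient descent can be illustrated with a small simulation. The sketch below is not the authors' code: the network width `m`, step size `lr`, penalty weight `lam`, and the sine ground-truth function are all illustrative assumptions, and the paper's penalty may be defined differently (e.g., on the distance to initialization rather than the raw weights). It trains a wide two-layer ReLU network on noisy observations and reports the $L_{2}$ prediction error against the noiseless ground truth under both training schemes.

```python
import numpy as np

# Minimal sketch (not the authors' method): noisy non-parametric regression
# with a wide two-layer ReLU network, comparing vanilla gradient descent
# against l2-regularized gradient descent. All hyperparameters are assumed.
rng = np.random.default_rng(0)
n, m, lr, steps, lam = 50, 2000, 0.1, 2000, 1e-2  # lam: assumed l2 penalty weight

x = rng.uniform(-1.0, 1.0, size=(n, 1))
f_star = np.sin(np.pi * x)                      # stand-in ground-truth function
y = f_star + 0.3 * rng.standard_normal((n, 1))  # noisy observations

# NTK-style parameterization: f(x) = (1/sqrt(m)) * sum_r a_r * relu(w_r x + b_r),
# with outer weights a_r fixed at random signs, as in common NTK analyses.
W = rng.standard_normal((1, m))
b = rng.standard_normal((1, m))
a = rng.choice([-1.0, 1.0], size=(m, 1))

def predict(W, b):
    h = np.maximum(x @ W + b, 0.0)              # ReLU features, shape (n, m)
    return h @ a / np.sqrt(m)

def grad(W, b, reg):
    """Gradient of (1/2n)||f - y||^2 + (reg/2)(||W||^2 + ||b||^2)."""
    h = np.maximum(x @ W + b, 0.0)
    resid = h @ a / np.sqrt(m) - y              # residuals, shape (n, 1)
    act = (h > 0).astype(float)                 # ReLU derivative
    gW = (x.T @ (resid * act * a.T)) / (np.sqrt(m) * n) + reg * W
    gb = (resid * act * a.T).sum(0, keepdims=True) / (np.sqrt(m) * n) + reg * b
    return gW, gb

for reg in (0.0, lam):                          # vanilla GD, then l2-regularized GD
    Wt, bt = W.copy(), b.copy()
    for _ in range(steps):
        gW, gb = grad(Wt, bt, reg)
        Wt -= lr * gW
        bt -= lr * gb
    err = np.mean((predict(Wt, bt) - f_star) ** 2)
    print(f"penalty weight = {reg:g}: L2 prediction error ~ {err:.4f}")
```

In this toy setting the unregularized run interpolates the noisy labels, so its error against the noiseless $f^{*}$ stalls at the noise level, while the regularized run shrinks it; this mirrors, but of course does not prove, the dichotomy stated in the abstract.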
Year | Venue | Keywords |
---|---|---
2022 | International Conference on Learning Representations (ICLR) | Overparametrized Deep Neural Network, Neural Tangent Kernel, Minimax, Non-parametric Regression
DocType | Citations | PageRank
---|---|---
Conference | 0 | 0.34

References | Authors
---|---
0 | 3
Name | Order | Citations | PageRank |
---|---|---|---
Namjoon Suh | 1 | 0 | 0.34 |
Hyunouk Ko | 2 | 0 | 0.34 |
Xiaoming Huo | 3 | 157 | 24.83 |