Direction Matters: On the Implicit Bias of Stochastic Gradient Descent with Moderate Learning Rate. | 0 | 0.34 | 2021 |
How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks? | 0 | 0.34 | 2021 |
On the Global Convergence of Training Deep Linear ResNets. | 0 | 0.34 | 2020 |
Improving Adversarial Robustness Requires Revisiting Misclassified Examples. | 0 | 0.34 | 2020 |