| Title | | | Year |
|---|---|---|---|
| Minimum Width for Universal Approximation. | 0 | 0.34 | 2021 |
| A unifying view on implicit bias in training linear neural networks. | 0 | 0.34 | 2021 |
| Are Transformers universal approximators of sequence-to-sequence functions? | 0 | 0.34 | 2020 |
| Efficiently testing local optimality and escaping saddles for ReLU networks. | 0 | 0.34 | 2019 |
| Small nonlinearities in activation functions create bad local minima in neural networks. | 0 | 0.34 | 2019 |
| Global Optimality Conditions for Deep Neural Networks. | 0 | 0.34 | 2018 |
| Finite sample expressive power of small-width ReLU networks. | 1 | 0.35 | 2018 |
| A Critical View of Global Optimality in Deep Learning. | 4 | 0.40 | 2018 |