Title | Citations | PageRank | Year |
---|---|---|---|
Adaptive Federated Optimization. | 0 | 0.34 | 2021 |
Can gradient clipping mitigate label noise? | 0 | 0.34 | 2020 |
Learning to Learn by Zeroth-Order Oracle. | 0 | 0.34 | 2020 |
Large Batch Optimization for Deep Learning: Training BERT in 76 minutes. | 0 | 0.34 | 2020 |
Are Transformers universal approximators of sequence-to-sequence functions? | 0 | 0.34 | 2020 |
On the Convergence of Adam and Beyond. | 0 | 0.34 | 2018 |
Fast stochastic optimization on Riemannian manifolds. | 3 | 0.40 | 2016 |