Deterministic Policy Gradient: Convergence Analysis | 0 | 0.34 | 2022 |
PER-ETD: A Polynomially Efficient Emphatic Temporal Difference Learning Method | 0 | 0.34 | 2022 |
Model-Based Offline Meta-Reinforcement Learning with Regularization | 0 | 0.34 | 2022 |
Sample Complexity Bounds For Two Timescale Value-Based Reinforcement Learning Algorithms | 0 | 0.34 | 2021 |
When Will Generative Adversarial Imitation Learning Algorithms Attain Global Convergence | 0 | 0.34 | 2021 |
Proximal Gradient Descent-Ascent: Variable Convergence under KŁ Geometry | 0 | 0.34 | 2021 |
Non-Asymptotic Convergence Of Adam-Type Reinforcement Learning Algorithms Under Markovian Sampling | 0 | 0.34 | 2021 |
Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality | 0 | 0.34 | 2021 |
CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee | 0 | 0.34 | 2021 |
Improving Sample Complexity Bounds for (Natural) Actor-Critic Algorithms | 0 | 0.34 | 2020 |
Reanalysis of Variance Reduced Temporal Difference Learning | 0 | 0.34 | 2020 |
Finite-Sample Analysis for SARSA with Linear Function Approximation | 0 | 0.34 | 2019 |
Two Time-scale Off-Policy TD Learning: Non-asymptotic Analysis over Markovian Samples | 0 | 0.34 | 2019 |
Finite-Sample Analysis for SARSA and Q-Learning with Linear Function Approximation | 1 | 0.34 | 2019 |
Convergence of SGD in Learning ReLU Models with Separable Data | 0 | 0.34 | 2018 |