Discovering a set of policies for the worst case reward. | 0 | 0.34 | 2021 |
Temporally-Extended ε-Greedy Exploration. | 0 | 0.34 | 2021 |
Fast Task Inference with Variational Intrinsic Successor Features. | 0 | 0.34 | 2020 |
Universal Successor Features Approximators. | 0 | 0.34 | 2019 |
Temporal Difference Learning with Neural Networks - Study of the Leakage Propagation Problem. | 0 | 0.34 | 2018 |
Unicorn: Continual Learning with a Universal, Off-policy Agent. | 2 | 0.37 | 2018 |
Entropic Policy Composition with Generalized Policy Improvement and Divergence Correction. | 0 | 0.34 | 2018 |