| Title | Count | Score | Year |
| --- | --- | --- | --- |
| Model-Value Inconsistency as a Signal for Epistemic Uncertainty | 0 | 0.34 | 2022 |
| Generalised Policy Improvement with Geometric Policy Composition | 0 | 0.34 | 2022 |
| Expected Eligibility Traces | 0 | 0.34 | 2021 |
| Discovering a Set of Policies for the Worst Case Reward | 0 | 0.34 | 2021 |
| On Efficiency in Hierarchical Reinforcement Learning | 0 | 0.34 | 2020 |
| Fast Reinforcement Learning with Generalized Policy Updates | 2 | 0.43 | 2020 |
| The Value Equivalence Principle for Model-Based Reinforcement Learning | 0 | 0.34 | 2020 |
| Fast Task Inference with Variational Intrinsic Successor Features | 1 | 0.36 | 2020 |
| The Option Keyboard: Combining Skills in Reinforcement Learning | 1 | 0.36 | 2019 |
| Universal Successor Features Approximators | 1 | 0.36 | 2019 |
| Adaptive Temporal-Difference Learning for Policy Evaluation with Per-State Uncertainty Estimates | 0 | 0.34 | 2019 |
| Fast Deep Reinforcement Learning Using Online Adjustments from the Past | 1 | 0.35 | 2018 |
| Transfer in Deep Reinforcement Learning Using Successor Features and Generalised Policy Improvement | 2 | 0.37 | 2018 |
| Natural Value Approximators: Learning When to Trust Past Estimates | 0 | 0.34 | 2017 |
| Successor Features for Transfer in Reinforcement Learning | 0 | 0.34 | 2017 |
| The Predictron: End-to-End Learning and Planning | 4 | 0.38 | 2017 |