Quantifying Differences in Reward Functions. | 0 | 0.34 | 2021 |
Learning Human Objectives by Evaluating Hypothetical Behavior | 0 | 0.34 | 2020 |
Pitfalls of learning a reward function online | 0 | 0.34 | 2020 |
Learning to Understand Goal Specifications by Modelling Reward. | 0 | 0.34 | 2019 |
Reward learning from human preferences and demonstrations in Atari. | 2 | 0.35 | 2018 |
Scaling shared model governance via model splitting. | 0 | 0.34 | 2018 |
Learning to Follow Language Instructions with Adversarial Reward Induction. | 4 | 0.39 | 2018 |
Scalable agent alignment via reward modeling: a research direction. | 3 | 0.37 | 2018 |
Jointly Learning "What" and "How" from Instructions and Goal-States. | 0 | 0.34 | 2018 |
Geometric Nontermination Arguments. | 0 | 0.34 | 2018 |
On Thompson Sampling and Asymptotic Optimality. | 0 | 0.34 | 2017 |
AI Safety Gridworlds. | 0 | 0.34 | 2017 |
Universal Reinforcement Learning Algorithms: Survey and Experiments. | 4 | 0.53 | 2017 |
Generalised Discount Functions applied to a Monte-Carlo AIμ Implementation. | 1 | 0.37 | 2017 |
Deep Reinforcement Learning from Human Preferences | 51 | 1.34 | 2017 |
Loss Bounds and Time Complexity for Speed Priors. | 0 | 0.34 | 2016 |
Ultimate Automizer with Two-track Proofs - (Competition Contribution). | 7 | 0.43 | 2016 |
Nonparametric General Reinforcement Learning. | 0 | 0.34 | 2016 |
Thompson Sampling is Asymptotically Optimal in General Environments. | 8 | 0.61 | 2016 |
Exploration Potential. | 0 | 0.34 | 2016 |
A Formal Solution to the Grain of Truth Problem. | 2 | 0.37 | 2016 |
Sequential Extensions of Causal and Evidential Decision Theory | 3 | 0.45 | 2015 |
A Definition of Happiness for Reinforcement Learning Agents. | 3 | 0.40 | 2015 |
Bad Universal Priors and Notions of Optimality | 12 | 0.75 | 2015 |
Ranking Templates for Linear Loops | 0 | 0.34 | 2015 |
Ultimate Automizer with Array Interpolation - (Competition Contribution). | 3 | 0.38 | 2015 |
On the Computability of Solomonoff Induction and Knowledge-Seeking | 5 | 0.51 | 2015 |
On the Computability of AIXI | 5 | 0.64 | 2015 |
Ranking Templates for Linear Loops. | 20 | 0.76 | 2014 |
Ranking Function Synthesis for Linear Lasso Programs. | 2 | 0.47 | 2014 |
Geometric Series as Nontermination Arguments for Linear Lasso Programs. | 2 | 0.39 | 2014 |
Linear Ranking for Linear Lasso Programs. | 12 | 0.55 | 2013 |
Synthesis For Polynomial Lasso Programs | 1 | 0.35 | 2013 |