On the Expressivity of Markov Reward (Extended Abstract) | 0 | 0.34 | 2022 |
Discovering a set of policies for the worst case reward | 0 | 0.34 | 2021 |
Behaviour Suite for Reinforcement Learning | 0 | 0.34 | 2020 |
Generative Adversarial Self-Imitation Learning | 1 | 0.35 | 2018 |
Minimax-Regret Querying on Side Effects for Safe Optimality in Factored Markov Decision Processes | 0 | 0.34 | 2018 |
Learning to Query, Reason, and Answer Questions On Ambiguous Texts | 0 | 0.34 | 2017 |
Learning and predicting dynamic networked behavior with graphical multiagent models | 0 | 0.34 | 2012 |
Strong mitigation: nesting search for good policies within search for good reward | 0 | 0.34 | 2012 |
Planning and evaluating multiagent influences under reward uncertainty | 0 | 0.34 | 2012 |