Lazy-MDPs: Towards Interpretable RL by Learning When to Act. | 0 | 0.34 | 2022 |
Continuous Control with Action Quantization from Demonstrations. | 0 | 0.34 | 2022 |
Learning Natural Language Generation with Truncated Reinforcement Learning | 0 | 0.34 | 2022 |
On the role of population heterogeneity in emergent communication | 0 | 0.34 | 2022 |
Solving N-Player Dynamic Routing Games with Congestion: A Mean-Field Approach. | 0 | 0.34 | 2022 |
Offline Reinforcement Learning With Pseudometric Learning | 0 | 0.34 | 2021 |
Show me the Way: Intrinsic Motivation from Demonstrations | 0 | 0.34 | 2021 |
Mean Field Games Flock! The Reinforcement Learning Way. | 0 | 0.34 | 2021 |
Adversarially Guided Actor-Critic | 0 | 0.34 | 2021 |
Self-Imitation Advantage Learning | 0 | 0.34 | 2021 |
What Matters for On-Policy Deep Actor-Critic Methods? A Large-Scale Study | 0 | 0.34 | 2021 |
Hyperparameter Selection for Imitation Learning | 0 | 0.34 | 2021 |
Fictitious Play for Mean Field Games: Continuous Time Analysis and Applications | 0 | 0.34 | 2020 |
On The Convergence Of Model Free Learning In Mean Field Games | 0 | 0.34 | 2020 |
HIGhER: Improving instruction following with Hindsight Generation for Experience Replay | 0 | 0.34 | 2020 |
Supervised Seeded Iterated Learning for Interactive Language Learning. | 0 | 0.34 | 2020 |
Countering Language Drift with Seeded Iterated Learning | 0 | 0.34 | 2020 |
Self-Attentional Credit Assignment for Transfer in Reinforcement Learning | 1 | 0.36 | 2020 |
Foolproof Cooperative Learning. | 0 | 0.34 | 2020 |
CopyCAT: Taking Control of Neural Policies with Constant Attacks | 0 | 0.34 | 2020 |
Leverage the Average: an Analysis of KL Regularization in Reinforcement Learning | 0 | 0.34 | 2020 |
Scaling up budgeted reinforcement learning. | 0 | 0.34 | 2019 |
Deep Conservative Policy Iteration | 0 | 0.34 | 2019 |
Observational Learning by Reinforcement Learning | 0 | 0.34 | 2019 |
Self-Educated Language Agent with Hindsight Experience Replay for Instruction Following. | 0 | 0.34 | 2019 |
Targeted Attacks on Deep Reinforcement Learning Agents through Adversarial Observations. | 0 | 0.34 | 2019 |
A Theory of Regularized Markov Decision Processes. | 0 | 0.34 | 2019 |
Learning from a Learner | 0 | 0.34 | 2019 |
Playing the Game of Universal Adversarial Perturbations. | 1 | 0.34 | 2018 |
Observe and Look Further: Achieving Consistent Performance on Atari. | 10 | 0.46 | 2018 |
End-to-end optimization of goal-driven and visually grounded dialogue systems. | 21 | 0.81 | 2017 |
Noisy Networks for Exploration. | 46 | 1.44 | 2017 |
Modulating early visual processing by language. | 29 | 0.92 | 2017 |
LIG-CRIStAL System for the WMT17 Automatic Post-Editing Task. | 0 | 0.34 | 2017 |
Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards. | 26 | 0.81 | 2017 |
Is the Bellman residual a bad proxy? | 0 | 0.34 | 2017 |
Observational Learning by Reinforcement Learning. | 0 | 0.34 | 2017 |
On the Use of Non-Stationary Strategies for Solving Two-Player Zero-Sum Markov Games. | 5 | 0.42 | 2016 |
Listen and Translate: A Proof of Concept for End-to-End Speech-to-Text Translation. | 1 | 0.35 | 2016 |
Should one minimize the expected Bellman residual or maximize the mean value? | 0 | 0.34 | 2016 |
Difference of Convex Functions Programming Applied to Control with Expert Data. | 0 | 0.34 | 2016 |
Score-based Inverse Reinforcement Learning. | 0 | 0.34 | 2016 |
MultiVec: a Multilingual and Multilevel Representation Learning Toolkit for NLP. | 10 | 0.56 | 2016 |
Softened Approximate Policy Iteration for Markov Games. | 4 | 0.42 | 2016 |
Compact and Interpretable Dialogue State Representation with Genetic Sparse Distributed Memory. | 0 | 0.34 | 2016 |
Imitation Learning Applied to Embodied Conversational Agents | 2 | 0.38 | 2015 |
Bayesian Credible Intervals for Online and Active Learning of Classification Trees | 0 | 0.34 | 2015 |
Inverse reinforcement learning in relational domains | 4 | 0.41 | 2015 |
Optimism in Active Learning. | 2 | 0.39 | 2015 |