Abstract
---
In many environments only a tiny subset of all states yield high reward. In these cases, few of the interactions with the environment provide a relevant learning signal. Hence, we may want to preferentially train on those high-reward states and the probable trajectories leading to them. To this end, we advocate for the use of a backtracking model that predicts the preceding states that terminate at a given high-reward state. We can train a model which, starting from a high value state (or one that is estimated to have high value), predicts and samples which (state, action)-tuples may have led to that high value state. These traces of (state, action) pairs, which we refer to as Recall Traces, sampled from this backtracking model starting from a high value state, are informative as they terminate in good states, and hence we can use these traces to improve a policy. We provide a variational interpretation for this idea and a practical algorithm in which the backtracking model samples from an approximate posterior distribution over trajectories which lead to large rewards. Our method improves the sample efficiency of both on- and off-policy RL algorithms across several environments and tasks.
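As a rough illustration of the idea summarised above (and not the authors' reference implementation), the sketch below shows one way a backtracking model could be parameterised and used: it samples (state, action) pairs backwards from a high-value state to form a recall trace, and the policy is then trained to imitate those pairs. The class and function names (`BacktrackingModel`, `sample_recall_trace`, `imitate_recall_traces`), the diagonal-Gaussian parameterisation, and the `policy.log_prob` interface are all assumptions made for this example.

```python
import torch
import torch.nn as nn

class BacktrackingModel(nn.Module):
    """Backtracking model: given a state s_t, predict the previous action
    a_{t-1} and previous state s_{t-1} (diagonal Gaussians, by assumption)."""
    def __init__(self, state_dim, action_dim, hidden=128):
        super().__init__()
        # q(a_{t-1} | s_t): outputs mean and log-std of the previous action
        self.action_head = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * action_dim))
        # q(s_{t-1} | s_t, a_{t-1}): outputs mean and log-std of the previous state
        self.state_head = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * state_dim))

    def forward(self, state):
        a_mu, a_logstd = self.action_head(state).chunk(2, dim=-1)
        prev_action = a_mu + a_logstd.exp() * torch.randn_like(a_mu)
        s_mu, s_logstd = self.state_head(
            torch.cat([state, prev_action], dim=-1)).chunk(2, dim=-1)
        prev_state = s_mu + s_logstd.exp() * torch.randn_like(s_mu)
        return prev_state, prev_action


def sample_recall_trace(backtrack_model, high_value_state, length=10):
    """Walk backwards from a high-value state, collecting (state, action) pairs."""
    trace, state = [], high_value_state
    for _ in range(length):
        prev_state, prev_action = backtrack_model(state)
        trace.append((prev_state, prev_action))
        state = prev_state
    return list(reversed(trace))  # chronological order, ending near the good state


def imitate_recall_traces(policy, optimizer, traces):
    """Nudge the policy towards the recall traces by maximising its
    log-likelihood of the (state, action) pairs they contain.
    Assumes policy.log_prob(state, action) returns per-sample log-likelihoods."""
    loss = torch.zeros(())
    for trace in traces:
        for state, action in trace:
            loss = loss - policy.log_prob(state.detach(), action.detach()).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In a full training loop one would presumably interleave this imitation step with the base on- or off-policy RL updates, seeding `sample_recall_trace` with states drawn from a buffer of high-reward (or high estimated value) states, as the abstract describes.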
Year | Venue | Field
---|---|---
2019 | ICLR | Computer science, Artificial intelligence, Backtracking, Recall, Machine learning, Reinforcement learning

DocType | Citations | PageRank
---|---|---
Conference | 0 | 0.34

References | Authors
---|---
0 | 8

Name | Order | Citations | PageRank
---|---|---|---
Anirudh Goyal Alias Parth Goyal | 1 | 2 | 1.37 |
Philemon Brakel | 2 | 236 | 11.60 |
William Fedus | 3 | 49 | 5.01 |
Soumye Singhal | 4 | 0 | 0.68 |
Timothy P. Lillicrap | 5 | 4377 | 170.65 |
Sergey Levine | 6 | 3377 | 182.21 |
Hugo Larochelle | 7 | 7692 | 488.99 |
Yoshua Bengio | 8 | 42677 | 3039.83 |