Fast Task Inference with Variational Intrinsic Successor Features - Citegraph

Paper Info

Title
Fast Task Inference with Variational Intrinsic Successor Features

Abstract
It has been established that diverse behaviors spanning the controllable subspace of a Markov decision process can be trained by rewarding a policy for being distinguishable from other policies. However, one limitation of this formulation is the difficulty to generalize beyond the finite set of behaviors being explicitly learned, as may be needed in subsequent tasks. Successor features provide an appealing solution to this generalization problem, but require defining the reward function as linear in some grounded feature space. In this paper, we show that these two techniques can be combined, and that each method solves the other's primary limitation. To do so we introduce Variational Intrinsic Successor FeatuRes (VISR), a novel algorithm which learns controllable features that can be leveraged to provide enhanced generalization and fast task inference through the successor features framework. We empirically validate VISR on the full Atari suite, in a novel setup wherein the rewards are only exposed briefly after a long unsupervised phase. Achieving human-level performance on 12 games and beating all baselines, we believe VISR represents a step towards agents that rapidly learn from limited feedback.

Year	Venue	DocType
2020	ICLR	Conference
Citations	PageRank	References	Authors
1	0.36	0	6

Authors (6 rows)

Name	Order	Citations	PageRank
Steven Hansen	1	3	1.40
William Dabney	2	270	17.86
André Barreto	3	12	5.65
David Warde-Farley	4	1413	101.45
Tom Van de Wiele	5	3	1.40
Volodymyr Mnih	6	3796	158.28
