Abstract | ||
---|---|---|
Reinforcement learning (RL) makes it possible to train agents capable of achieving sophisticated goals in complex and uncertain environments. A key difficulty in reinforcement learning is specifying a reward function for the agent to optimize. Traditionally, imitation learning in RL has been used to overcome this problem. Unfortunately, hitherto imitation learning methods tend to require that demonstrations are supplied in the first-person: the agent is provided with a sequence of states and a specification of the actions that it should have taken. While powerful, this kind of imitation learning is limited by the relatively hard problem of collecting first-person demonstrations. Humans address this problem by learning from third-person demonstrations: they observe other humans perform tasks, infer the task, and accomplish the same task themselves. In this paper, we present a method for unsupervised third-person imitation learning. Here third-person refers to training an agent to correctly achieve a simple goal in a simple environment when it is provided a demonstration of a teacher achieving the same goal but from a different viewpoint; and unsupervised refers to the fact that the agent receives only these third-person demonstrations, and is not provided a correspondence between teacher states and student states. Our method's primary insight is that recent advances from domain confusion can be utilized to yield domain agnostic features which are crucial during the training process. To validate our approach, we report successful experiments on learning from third-person demonstrations in a pointmass domain, a reacher domain, and inverted pendulum. |
Year | Venue | Field |
---|---|---|
2017 | International Conference on Learning Representations | Inverted pendulum, Confusion, Computer science, Cognitive imitation, Imitation, Artificial intelligence, Error-driven learning, Imitation learning, Machine learning, Reinforcement learning |
DocType | Citations | PageRank |
Conference | 19 | 0.78 |
References | Authors | |
22 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Bradly C. Stadie | 1 | 82 | 6.02 |
Pieter Abbeel | 2 | 6363 | 376.48 |
Ilya Sutskever | 3 | 25814 | 1120.24 |