Abstract |
---|
Reinforcement Learning (RL) has been effectively used to solve complex problems given careful design of the problem and algorithm parameters. However, standard RL approaches do not scale particularly well with the size of the problem and often require extensive engineering on the part of the designer to minimize the search space. To alleviate this problem, we present a model-free policy-based approach called Exploration from Demonstration (EfD) that uses human demonstrations to guide search space exploration. We use statistical measures of RL algorithms to provide feedback to the user about the agent's uncertainty and use this to solicit targeted demonstrations useful from the agent's perspective. The demonstrations are used to learn an exploration policy that actively guides the agent towards important aspects of the problem. We instantiate our approach in a gridworld and a popular arcade game and validate its performance under different experimental conditions. We show how EfD scales to large problems and provides convergence speed-ups over traditional exploration and interactive learning methods. |
Year | Venue | Field |
---|---|---|
2016 | AAMAS | Convergence (routing), Interactive Learning, Active learning, Computer science, Space exploration, Artificial intelligence, Error-driven learning, Machine learning, Reinforcement learning, Complex problems |
DocType | Citations | PageRank |
Conference | 9 | 0.55 |
References | Authors | |
24 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Kaushik Subramanian | 1 | 10 | 1.62 |
Charles L. Isbell | 2 | 504 | 65.79 |
Andrea Lockerd Thomaz | 3 | 1115 | 84.85 |