Abstract |
---|
Inverse reinforcement learning (IRL) is the problem of learning the preferences of an agent by observing its behavior on a task. While this problem has received sustained attention, the related problem of online IRL, in which observations accrue incrementally yet the application's real-time demands often prohibit a full rerun of an IRL method, has received much less attention. We introduce a formal framework for online IRL, called incremental IRL (I2RL), and a new method that advances maximum entropy IRL with hidden variables to this setting. Our analysis shows that the new method improves monotonically with more demonstration data and has probabilistically bounded error, under both full and partial observability. Experiments in a simulated robotic application involving learning under occlusion show that I2RL significantly outperforms both batch IRL and an online imitation learning method. |
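For orientation, the batch maximum-entropy IRL procedure that the abstract's method builds on can be sketched as below. This is a generic illustration of MaxEnt IRL (Ziebart et al.), not the paper's I2RL algorithm; the 3-state chain MDP, one-hot features, and all hyperparameters are illustrative assumptions.

```python
import numpy as np

# Assumed toy setup: 3-state chain, 2 actions (left/right), one-hot state
# features. None of these details are from the paper itself.
n_states, n_actions = 3, 2
gamma, horizon = 0.9, 20
P = np.array([[0, 1], [0, 2], [1, 2]])   # deterministic: next_state = P[s, a]
features = np.eye(n_states)              # phi(s) = one-hot vector

def soft_policy(theta):
    """Soft (maximum-entropy) value iteration for reward r(s) = theta . phi(s)."""
    r = features @ theta
    V = np.zeros(n_states)
    for _ in range(100):
        Q = r[:, None] + gamma * V[P]                        # shape (S, A)
        m = Q.max(axis=1)
        V = m + np.log(np.exp(Q - m[:, None]).sum(axis=1))   # stable log-sum-exp
    return np.exp(Q - V[:, None])        # each row is a distribution over actions

def expected_features(theta, start=0):
    """Average feature counts of the soft policy over a finite horizon."""
    pi = soft_policy(theta)
    d = np.zeros(n_states)
    d[start] = 1.0                       # start-state distribution
    mu = np.zeros(n_states)
    for _ in range(horizon):
        mu += d
        d_next = np.zeros(n_states)
        for s in range(n_states):
            for a in range(n_actions):
                d_next[P[s, a]] += d[s] * pi[s, a]
        d = d_next
    return features.T @ (mu / horizon)

# "Demonstrations": feature counts of a synthetic expert that prefers state 2.
demo_mu = expected_features(np.array([0.0, 0.0, 5.0]))

# MaxEnt IRL gradient ascent: adjust theta until the learner's expected
# feature counts match the expert's (the gradient of the log-likelihood).
theta = np.zeros(n_states)
for _ in range(200):
    theta += 0.1 * (demo_mu - expected_features(theta))
```

In the online setting the paper targets, the demonstration set grows incrementally, and rerunning this full batch loop on every new observation is what becomes prohibitive.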
Year | DOI | Venue |
---|---|---|
2019 | 10.5555/3306127.3331818 | adaptive agents and multi-agent systems |
Keywords | Field | DocType |
---|---|---|
Robot Learning,Online Learning,Robotics,Reinforcement Learning,Inverse Reinforcement Learning | Robot learning,Monotonic function,Observability,Computer science,Inverse reinforcement learning,Artificial intelligence,Principle of maximum entropy,Hidden variable theory,Machine learning,Robotics,Reinforcement learning | Conference |
Citations | PageRank | References |
---|---|---|
0 | 0.34 | 0 |
Authors |
---|
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Saurabh Arora | 1 | 5 | 1.48 |
Prashant Doshi | 2 | 926 | 90.23 |
Bikramjit Banerjee | 3 | 284 | 32.63 |