Abstract |
---|
As robots become more commonplace within society, the need for tools that enable non-robotics-experts to develop control algorithms, or policies, will increase. Learning from Demonstration (LfD) offers one promising approach, in which the robot learns a policy from teacher task executions. In this work we present an algorithm that incorporates human teacher feedback to enable policy improvement from learner experience within an LfD framework. We present two implementations of this algorithm, which differ in the sort of teacher feedback they incorporate. In the first implementation, called Binary Critiquing (BC), the teacher provides a binary indication that highlights poorly performing portions of the execution. In the second implementation, called Advice-Operator Policy Improvement (A-OPI), the teacher provides a correction on poorly performing portions of the student execution. Most notably, these corrections are continuous-valued and appropriate for low-level motion control action spaces. The algorithms are applied to simulated and real robot validation domains. For both, policy performance is found to improve with teacher feedback. Specifically, with BC, learner execution success and efficiency come to exceed teacher performance. With A-OPI, task success and accuracy are shown to be similar or superior to the typical LfD approach of correcting behavior through more teacher demonstrations. |
Year | Venue | Field |
---|---|---|
2009 | AAAI Spring Symposium: Agents that Learn from Human Teachers | Robot learning, Social robot, Robot control, Motion control, Computer science, Implementation, Artificial intelligence, Robot, Robot motion control, Machine learning |
DocType | Citations | PageRank |
Conference | 0 | 0.34 |
References | Authors |
---|---|
7 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Brenna Argall | 1 | 1256 | 62.99 |
Brett Browning | 2 | 1178 | 55.70 |
Manuela Veloso | 3 | 8563 | 882.50 |