Abstract | ||
---|---|---|
Reinforcement learning is a powerful machine learning paradigm that allows agents to autonomously learn to maximize a scalar reward. However, it often suffers from poor initial performance and long learning times. This paper discusses how collecting on-line human feedback, both in real time and post hoc, can potentially improve the performance of such learning systems. We use the game Pac-Man to simulate a navigation setting and show that workers are able to accurately identify both when a sub-optimal action is executed and what action should have been performed instead. Our results demonstrate that the crowd is capable of generating helpful input. We conclude with a discussion of the types of errors that occur most commonly when engaging human workers for this task, and of how such data could be used to improve learning. Our work serves as a critical first step in designing systems that use real-time human feedback to improve the learning performance of automated systems on the fly. |
Year | Venue | Field |
---|---|---|
2015 | AAAI Workshop: Learning for General Competency in Video Games | Robot learning, Active learning (machine learning), Computer science, Artificial intelligence, Error-driven learning, Machine learning, Reinforcement learning |
DocType | Citations | PageRank |
Conference | 1 | 0.35 |
References | Authors | |
10 | 4 | |
Name | Order | Citations | PageRank |
---|---|---|---|
Gabriel V. de la Cruz | 1 | 12 | 2.41 |
Bei Peng | 2 | 43 | 6.96 |
Walter Lasecki | 3 | 833 | 67.19 |
Matthew E. Taylor | 4 | 1352 | 94.88 |