Title
Active contextual policy search
Abstract
We consider the problem of learning skills that are applicable to a wide range of tasks. One popular approach for learning such skills is contextual policy search, in which the individual tasks are represented as context vectors. We are interested in settings in which the agent can actively select the tasks that it examines during the learning process. We argue that selecting each task equally often is not the best strategy, because some tasks might be easier to learn at the beginning, and the knowledge that the agent extracts from these tasks can be transferred to similar but more difficult tasks. The methods that we propose for the task-selection problem model the learning process as a nonstationary multi-armed bandit problem with custom intrinsic reward heuristics, so that the estimated learning progress is maximized. This approach makes no assumptions about either the underlying contextual policy search algorithm or the policy representation. We present empirical results on an artificial benchmark problem and a ball-throwing problem with a simulated Mitsubishi PA-10 robot arm, which show that active context selection can improve the learning of skills considerably.
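The task-selection idea in the abstract can be illustrated with a small sketch. The abstract does not say which bandit algorithm or which reward heuristic is used, so everything below is an assumption made for illustration: a discounted UCB rule stands in for the nonstationary bandit, each arm is one task (context), and the intrinsic reward is the observed improvement on the selected task. The class `DiscountedUCB`, the function `run`, the learning rates, and the toy "skill" update are all hypothetical and replace a real contextual policy search learner.

```python
import math
import random


class DiscountedUCB:
    """Discounted UCB bandit (assumed choice); each arm is one task."""

    def __init__(self, n_arms, gamma=0.95, xi=0.5):
        self.gamma = gamma              # discount: older rewards count less
        self.xi = xi                    # exploration weight
        self.counts = [0.0] * n_arms    # discounted pull counts
        self.sums = [0.0] * n_arms      # discounted reward sums

    def select(self):
        # Try every arm once before relying on the confidence bound.
        for arm, count in enumerate(self.counts):
            if count == 0.0:
                return arm
        total = sum(self.counts)

        def score(arm):
            mean = self.sums[arm] / self.counts[arm]
            bonus = math.sqrt(self.xi * math.log(total) / self.counts[arm])
            return mean + bonus

        return max(range(len(self.counts)), key=score)

    def update(self, arm, reward):
        # Discount all statistics, then add the new observation.
        self.counts = [self.gamma * c for c in self.counts]
        self.sums = [self.gamma * s for s in self.sums]
        self.counts[arm] += 1.0
        self.sums[arm] += reward


def run(n_steps=200, seed=0):
    """Toy loop: the bandit picks a task, the 'learner' practices it."""
    rng = random.Random(seed)
    learn_rates = [0.30, 0.05, 0.0]     # hypothetical: easy, hard, unlearnable
    skill = [0.0, 0.0, 0.0]             # stand-in for per-task performance
    bandit = DiscountedUCB(n_arms=len(skill))
    pulls = [0] * len(skill)
    for _ in range(n_steps):
        task = bandit.select()
        before = skill[task]
        skill[task] += learn_rates[task] * (1.0 - skill[task])
        # Intrinsic reward: noisy estimate of learning progress on this task.
        progress = skill[task] - before + rng.gauss(0.0, 0.01)
        bandit.update(task, progress)
        pulls[task] += 1
    return pulls, skill


if __name__ == "__main__":
    pulls, skill = run()
    print("pulls per task:", pulls)
    print("final skill per task:", [round(s, 3) for s in skill])
```

Because the learning progress on a task shrinks as the task is mastered, the reward distribution of each arm drifts over time; discounting old observations is one simple way to track that drift. Note that the bandit only sees task indices and scalar progress estimates, so the learner behind it can be swapped freely.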
Year: 2014
DOI: 10.5555/2627435.2697072
Venue: Journal of Machine Learning Research
Keywords: reinforcement learning, movement primitives, active learning, multi-task learning, policy search
Field: Robot learning, Multi-task learning, Active learning, Active learning (machine learning), Computer science, Q-learning, Artificial intelligence, Cooperative learning, Proactive learning, Machine learning, Reinforcement learning
DocType: Journal
Volume: 15
Issue: 1
ISSN: 1532-4435
Citations: 2
PageRank: 0.37
References: 21
Authors
2
Name
Order
Citations
PageRank
Alexander Fabisch1113.57
Jan Hendrik Metzen237427.06