Active contextual policy search - Citegraph

Paper Info

Title
Active contextual policy search

Abstract
We consider the problem of learning skills that are applicable across a wide range of tasks. One popular approach for learning such skills is contextual policy search, in which the individual tasks are represented as context vectors. We are interested in settings in which the agent is able to actively select the tasks that it examines during the learning process. We argue that there is a better way than selecting each task equally often, because some tasks might be easier to learn at the beginning, and the knowledge that the agent extracts from these tasks can be transferred to similar but more difficult tasks. The methods that we propose for addressing the task-selection problem model the learning process as a nonstationary multi-armed bandit problem with custom intrinsic reward heuristics, so that the estimated learning progress is maximized. This approach makes no assumptions about either the underlying contextual policy search algorithm or the policy representation. We present empirical results on an artificial benchmark problem and on a ball-throwing problem with a simulated Mitsubishi PA-10 robot arm, which show that active context selection can improve the learning of skills considerably.
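
The task-selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the epsilon-greedy rule, the constant step size used to track nonstationary rewards, the `learning_progress` heuristic, and the toy task model in `run_toy_experiment` are all assumptions chosen for illustration, and every name below is hypothetical.

```python
import random


class RecencyWeightedBandit:
    """Epsilon-greedy bandit with a constant step size, so old rewards fade.

    The constant step size is one simple way to handle nonstationary
    rewards; it is an assumption here, not the method from the paper.
    """

    def __init__(self, n_arms, step_size=0.3, epsilon=0.1, seed=0):
        self.values = [0.0] * n_arms  # recency-weighted reward estimates
        self.step_size = step_size
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def select(self):
        # Explore a random task with probability epsilon.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        # Otherwise pick the best estimate, breaking ties at random.
        best = max(self.values)
        candidates = [i for i, v in enumerate(self.values) if v == best]
        return self.rng.choice(candidates)

    def update(self, arm, reward):
        # Exponential recency-weighted average of observed rewards.
        self.values[arm] += self.step_size * (reward - self.values[arm])


def learning_progress(previous_return, current_return):
    """Hypothetical intrinsic reward: improvement in return on the task."""
    return current_return - previous_return


def run_toy_experiment(n_steps=200, seed=0):
    """Toy stand-in for a learner: practicing a task raises its skill."""
    rates = [0.5, 0.1, 0.02]  # assumed per-task learning rates
    skill = [0.0] * len(rates)  # return obtained on each task
    counts = [0] * len(rates)  # how often each task was selected
    bandit = RecencyWeightedBandit(len(rates), seed=seed)
    for _ in range(n_steps):
        task = bandit.select()
        before = skill[task]
        skill[task] += rates[task] * (1.0 - skill[task])
        bandit.update(task, learning_progress(before, skill[task]))
        counts[task] += 1
    return counts, skill
```

Calling `run_toy_experiment()` returns the per-task selection counts and final skill levels. Because the intrinsic reward shrinks as a task saturates, the bandit's estimate for that task decays and selection can shift to tasks where progress is still possible, which is the behavior the abstract argues for.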

Year	DOI	Venue
2014	10.5555/2627435.2697072	Journal of Machine Learning Research
Keywords	Field	DocType
reinforcement learning,movement primitives,active learning,multi-task learning,policy search	Robot learning,Multi-task learning,Active learning,Active learning (machine learning),Computer science,Q-learning,Artificial intelligence,Cooperative learning,Proactive learning,Machine learning,Reinforcement learning	Journal
Volume	Issue	ISSN
15	1	1532-4435
Citations	PageRank	References
2	0.37	21
Authors
2

Authors (2 rows)

Name	Order	Citations	PageRank
Alexander Fabisch	1	11	3.57
Jan Hendrik Metzen	2	374	27.06
