Title
Policy Learning With An Efficient Black-Box Optimization Algorithm
Abstract
Robotic learning on real hardware requires an efficient algorithm that minimizes the number of trials needed to learn an optimal policy. Prolonged use of hardware causes wear and tear on the system and demands more attention from an operator. To this end, we present a novel black-box optimization algorithm, Reward Optimization with Compact Kernels and fast natural gradient regression (ROCK*). Our algorithm updates its knowledge immediately after a single trial and is able to extrapolate in a controlled manner. These features make fast and safe learning on real hardware possible. The performance of our method is evaluated on standard benchmark functions that are commonly used to test optimization algorithms. We also present three different robotic optimization examples using ROCK*. The first robotic example is on a simulated robot arm, the second on a real articulated legged system, and the third on a simulated quadruped robot with 12 actuated joints. ROCK* outperforms current state-of-the-art algorithms in all tasks, sometimes by an order of magnitude.
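As a rough illustration of the per-trial setting the abstract describes (update after every single rollout, with conservative moves of the search distribution), the following minimal Python sketch shows a generic black-box policy-parameter optimization loop. It is not the authors' ROCK* method; the Gaussian search distribution, the placeholder reward function, and the simple update rule are assumptions made for illustration only.

```python
# Minimal sketch of a per-trial black-box optimization loop (not ROCK* itself).
# After each single trial, the current estimate is updated and a new candidate
# policy parameter vector is proposed from a slowly shrinking search distribution.
import numpy as np

def reward(params):
    # Placeholder objective (negated sphere benchmark); on a robot this would be
    # the reward obtained from one rollout with the given policy parameters.
    return -np.sum(params ** 2)

def propose(mean, cov, rng):
    # Sample one candidate policy from a Gaussian search distribution.
    return rng.multivariate_normal(mean, cov)

def optimize(dim=2, trials=100, seed=0):
    rng = np.random.default_rng(seed)
    mean, cov = np.zeros(dim), np.eye(dim)
    best_x, best_r = mean.copy(), reward(mean)
    for _ in range(trials):
        x = propose(mean, cov, rng)
        r = reward(x)                      # one trial on the (simulated) system
        if r > best_r:                     # knowledge updated after every trial
            best_x, best_r = x, r
            mean = 0.5 * mean + 0.5 * x    # conservative move limits extrapolation
        cov *= 0.99                        # gradually shrink the search distribution
    return best_x, best_r

if __name__ == "__main__":
    print(optimize())
```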
Year
2015
DOI
10.1142/S0219843615500292
Venue
INTERNATIONAL JOURNAL OF HUMANOID ROBOTICS
Keywords
Policy optimization, robotic learning, black-box optimization
Field
Black box, Natural gradient, Robotic arm, Wear and tear, Simulation, Policy learning, Computer science, Optimization algorithm, Operator, Robot
DocType
Journal
Volume
12
Issue
3
ISSN
0219-8436
Citations
1
PageRank
0.36
References
5
Authors
5
Name | Order | Citations | PageRank
Jemin Hwangbo | 1 | 22 | 2.72
Christian Gehring | 2 | 180 | 13.79
Hannes Sommer | 3 | 74 | 6.81
Roland Siegwart | 4 | 7640 | 551.49
Jonas Buchli | 5 | 1081 | 72.94