Abstract |
---|
A core novelty of Alpha Zero is the interleaving of tree search and deep learning, which has proven very successful in board games like Chess, Shogi and Go. These games have a discrete action space. However, many real-world reinforcement learning domains have continuous action spaces, for example in robotic control, navigation and self-driving cars. This paper presents the necessary theoretical extensions of Alpha Zero to deal with continuous action space. We also provide some preliminary experiments on the Pendulum swing-up task, empirically showing the feasibility of our approach. Thereby, this work provides a first step towards the application of iterated search and learning in domains with a continuous action space. |

Year | Venue | Field |
---|---|---|
2018 | arXiv: Machine Learning | Robotic control, Artificial intelligence, Novelty, Deep learning, Pendulum, Iterated function, Machine learning, Interleaving, Mathematics, Reinforcement learning |

DocType | Volume | Citations |
---|---|---|
Journal | abs/1805.09613 | 2 |

PageRank | References | Authors |
---|---|---|
0.39 | 9 | 4 |

Name | Order | Citations | PageRank |
---|---|---|---|
Thomas M. Moerland | 1 | 12 | 1.58 |
Joost Broekens | 2 | 344 | 37.07 |
Aske Plaat | 3 | 524 | 72.18 |
Catholijn M. Jonker | 4 | 2252 | 241.53 |