Title | ||
---|---|---|
Towards High Level Skill Learning: Learn to Return Table Tennis Ball Using Monte-Carlo Based Policy Gradient Method |
Abstract | ||
---|---|---|
Deep learning has achieved great success in both visual and acoustic recognition and classification tasks. The accuracy of many state-of-the-art methods has surpassed that of human beings. However, in the field of robotics, it remains a big challenge for a real robot to master a high-level skill using deep learning methods, even though humans can easily learn such a task from demonstration, imitation and practice. Compared to Go and Atari games, tasks of this kind are usually continuous in both state space and action space, which makes value-based reinforcement learning methods inapplicable. Making a robot learn to return a ball to a desired point in table tennis is a typical example of such a task. It would be a promising step if a robot could learn to play table tennis without exact knowledge of the models in this sport, just as human players do. In this paper, we consider this kind of motion decision skill learning, a one-step decision making process, and give a Monte-Carlo based reinforcement learning method in the framework of Deep Deterministic Policy Gradient. We then apply this method to robotic table tennis and test it on two tasks. The first is to return balls to a desired point, and the second is to return balls to randomly selected landing points. The experimental results demonstrate that the trained policy can successfully return balls with random motion states to both a designated point and randomly selected landing points with high accuracy. |
Year | DOI | Venue |
---|---|---|
2018 | 10.1109/RCAR.2018.8621776 | 2018 IEEE International Conference on Real-time Computing and Robotics (RCAR) |
Keywords | Field | DocType |
table tennis ball,deep learning methods,state space,action space,motion decision skill learning,Monte-Carlo based reinforcement learning method,robotic table tennis,random motion state,deep deterministic policy gradient,landing points,high level skill learning,Monte-Carlo based policy gradient method,one-step decision making process | Gradient method,Computer science,Ball (bearing),Artificial intelligence,Imitation,Deep learning,Robot,State space,Robotics,Reinforcement learning | Conference |
ISBN | Citations | PageRank |
978-1-5386-6870-2 | 1 | 0.35 |
References | Authors | |
7 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Yifeng Zhu | 1 | 513 | 35.33 |
Yongsheng Zhao | 2 | 75 | 19.66 |
Lisen Jin | 3 | 1 | 0.35 |
Jun Wu | 4 | 456 | 75.01 |
Rong Xiong | 5 | 53 | 14.05 |