Title | ||
---|---|---|
HJB-RL: Initializing Reinforcement Learning with Optimal Control Policies Applied to Autonomous Drone Racing |
Abstract | ||
---|---|---|
In this work, we present a planning and control method for a quadrotor in an autonomous drone race. Our method combines the advantages of both model-based optimal control and model-free deep reinforcement learning. We consider a single drone racing on a track marked by a series of gates, through which it must maneuver in minimum time. First, we solve the discretized Hamilton-Jacobi-Bellman (HJB) equation to produce a closed-loop policy for a simplified, reduced-order model of the drone. Next, we train a deep network policy in a supervised fashion to mimic the HJB policy. Finally, we further train this network using policy gradient reinforcement learning on the full drone dynamics model with a low-level feedback controller in the loop. This gives a deep network policy for controlling the drone to pass through a single gate. In a race course, this policy is applied successively to each new oncoming gate to guide the drone through the course. The resulting policy completes a high-fidelity AirSim drone race with 12 gates in 34.89 s on average, outracing a model-based HJB policy by 33.20 s, a supervised learning policy by 1.24 s, and a trajectory planning policy by 12.99 s, while a model-free RL policy was never able to complete the race. |
Year | DOI | Venue |
---|---|---|
2021 | 10.15607/RSS.2021.XVII.062 | ROBOTICS: SCIENCE AND SYSTEMS XVII |
DocType | ISSN | Citations |
Conference | 2330-7668 | 0 |
PageRank | References | Authors |
0.34 | 0 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Keiko Nagami | 1 | 0 | 0.34 |
Mac Schwager | 2 | 0 | 0.68 |
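
The first stage named in the abstract, solving a discretized HJB equation to obtain a closed-loop policy for a simplified model, can be illustrated with a toy example. The sketch below is not the authors' drone model or solver: it assumes a hypothetical one-dimensional grid world in which the vehicle moves at most one cell per step and the "gate" is a single target cell. The function name `min_time_value_iteration` and all of its parameters are illustrative assumptions. It runs value iteration on the minimum-time cost and returns both the value function and the greedy feedback policy.

```python
import numpy as np


def min_time_value_iteration(n_cells, gate, controls=(-1, 0, 1), max_sweeps=1000):
    """Value iteration for a toy 1-D minimum-time problem.

    Hypothetical model (an assumption, not the paper's dynamics):
    the state is a cell index, a control shifts the state by u cells,
    and each step costs one unit of time until the gate cell is reached.
    """
    V = np.full(n_cells, np.inf)      # time-to-go, unknown everywhere...
    V[gate] = 0.0                     # ...except at the gate itself
    policy = np.zeros(n_cells, dtype=int)

    for _ in range(max_sweeps):
        V_new = V.copy()
        for x in range(n_cells):
            if x == gate:
                continue
            for u in controls:
                # Clamp the successor state to the grid boundaries.
                nxt = min(max(x + u, 0), n_cells - 1)
                cost = 1.0 + V[nxt]   # discrete Bellman backup
                if cost < V_new[x]:
                    V_new[x] = cost
                    policy[x] = u
        if np.array_equal(V_new, V):  # fixed point reached
            break
        V = V_new
    return V, policy


if __name__ == "__main__":
    V, policy = min_time_value_iteration(n_cells=10, gate=7)
    print(V)
    print(policy)
```

In this toy setting the value function reduces to the distance to the gate cell, and the policy is a state-feedback rule that always steps toward it. In the pipeline the abstract describes, a table of this kind would serve as the supervision target for the network before policy gradient fine-tuning on the full dynamics.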