HJB-RL: Initializing Reinforcement Learning with Optimal Control Policies Applied to Autonomous Drone Racing - Citegraph

Paper Info

Title
HJB-RL: Initializing Reinforcement Learning with Optimal Control Policies Applied to Autonomous Drone Racing

Abstract
In this work, we present a planning and control method for a quadrotor in an autonomous drone race. Our method combines the advantages of model-based optimal control and model-free deep reinforcement learning. We consider a single drone racing on a track marked by a series of gates, through which it must maneuver in minimum time. First, we solve the discretized Hamilton-Jacobi-Bellman (HJB) equation to produce a closed-loop policy for a simplified, reduced-order model of the drone. Next, we train a deep network policy in a supervised fashion to mimic the HJB policy. Finally, we further train this network using policy gradient reinforcement learning on the full drone dynamics model with a low-level feedback controller in the loop. This gives a deep network policy for controlling the drone to pass through a single gate. In a race course, this policy is applied successively to each new oncoming gate to guide the drone through the course. The resulting policy completes a high-fidelity AirSim drone race with 12 gates in 34.89 s (on average), outracing a model-based HJB policy by 33.20 s, a supervised learning policy by 1.24 s, and a trajectory planning policy by 12.99 s, while a model-free RL policy was never able to complete the race.
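As a rough illustration of the first step in the abstract (solving a discretized HJB equation to obtain a closed-loop policy), the sketch below runs value iteration for a minimum-time problem on a grid. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy 1D double-integrator model, the grid sizes, the three-level control set, the goal box, and the helper names `interp` and `solve_hjb`. The paper's reduced-order drone model and solver are not described on this page.

```python
import numpy as np

# Hypothetical toy model (NOT the paper's drone model): a 1D double integrator
# with state (position p, velocity v) and acceleration u in {-1, 0, +1}.
DT = 0.1                          # time step, also the cost of one step
P = np.linspace(-2.0, 2.0, 41)    # position grid
V = np.linspace(-2.0, 2.0, 41)    # velocity grid
U = np.array([-1.0, 0.0, 1.0])    # discrete control set
DP, DV = P[1] - P[0], V[1] - V[0]


def interp(value, p, v):
    """Bilinear interpolation of the value table; queries are clipped to the grid."""
    fp = np.clip((p - P[0]) / DP, 0, len(P) - 1)
    fv = np.clip((v - V[0]) / DV, 0, len(V) - 1)
    i0 = np.minimum(fp.astype(int), len(P) - 2)
    j0 = np.minimum(fv.astype(int), len(V) - 2)
    a, b = fp - i0, fv - j0
    return ((1 - a) * (1 - b) * value[i0, j0] + a * (1 - b) * value[i0 + 1, j0]
            + (1 - a) * b * value[i0, j0 + 1] + a * b * value[i0 + 1, j0 + 1])


def solve_hjb(n_sweeps=500, tol=1e-9):
    """Value iteration for the minimum time to reach a small box around the origin."""
    pp, vv = np.meshgrid(P, V, indexing="ij")
    goal = (np.abs(pp) < 0.15) & (np.abs(vv) < 0.15)
    value = np.zeros_like(pp)
    policy = np.zeros(pp.shape, dtype=int)
    for _ in range(n_sweeps):
        # Cost-to-go of each control: one time step plus the value at the next state.
        q = np.stack([DT + interp(value, pp + DT * vv, vv + DT * u) for u in U])
        new_value = np.where(goal, 0.0, q.min(axis=0))
        policy = q.argmin(axis=0)   # index into U of the best control per grid node
        converged = np.max(np.abs(new_value - value)) < tol
        value = new_value
        if converged:
            break
    return value, policy


value, policy = solve_hjb()
```

The resulting `policy` table is a closed-loop controller: looking up the grid node nearest to the current state gives the index of the control in `U`. In a pipeline like the one the abstract describes, a table of this kind would supply the state-action pairs for the supervised training step.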

Year	DOI	Venue
2021	10.15607/RSS.2021.XVII.062	Robotics: Science and Systems XVII
DocType	ISSN	Citations
Conference	2330-7668	0
PageRank	References	Authors
0.34	0	2

Authors (2 rows)

Name	Order	Citations	PageRank
Keiko Nagami	1	0	0.34
Mac Schwager	2	0	0.68
