Title | ||
---|---|---|
Hamilton-Jacobi Deep Q-Learning for Deterministic Continuous-Time Systems with Lipschitz Continuous Controls |
Abstract | ||
---|---|---|
In this paper, we propose Q-learning algorithms for continuous-time deterministic optimal control problems with Lipschitz continuous controls. A new class of Hamilton-Jacobi-Bellman (HJB) equations is derived from applying the dynamic programming principle to continuous-time Q-functions. Our method is based on a novel semi-discrete version of the HJB equation, which is proposed to design a Q-learning algorithm that uses data collected in discrete time without discretizing or approximating the system dynamics. We identify the conditions under which the Q-function estimated by this algorithm converges to the optimal Q-function. For practical implementation, we propose the Hamilton-Jacobi DQN, which extends the idea of deep Q-networks (DQN) to our continuous control setting. This approach does not require actor networks or numerical solutions to optimization problems for greedy actions since the HJB equation provides a simple characterization of optimal controls via ordinary differential equations. We empirically demonstrate the performance of our method through benchmark tasks and high-dimensional linear-quadratic problems. |
Year | DOI | Venue |
---|---|---|
2021 | v22/20-1235.html | JOURNAL OF MACHINE LEARNING RESEARCH |
Keywords | DocType | Volume |
---|---|---|
Q-learning, Deep Q-networks, Continuous-time dynamical systems, Optimal control, Hamilton-Jacobi-Bellman equations | Journal | 22 |
Issue | ISSN | Citations |
---|---|---|
1 | 1532-4435 | 0 |
PageRank | References | Authors |
---|---|---|
0.34 | 0 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jeongho Kim | 1 | 0 | 0.34 |
Jaeuk Shin | 2 | 0 | 0.34 |
Insoon Yang | 3 | 35 | 9.17 |