Title
Posterior Sampling-based Reinforcement Learning for Control of Unknown Linear Systems
Abstract
We propose a posterior sampling-based learning algorithm for the linear quadratic (LQ) control problem with unknown system parameters. The algorithm, called posterior sampling-based reinforcement learning for the LQ regulator (PSRL-LQ), uses two stopping criteria to determine the lengths of the dynamic episodes in posterior sampling. The first stopping criterion controls the growth rate of the episode length. The second stopping criterion is triggered when the determinant of the sample covariance matrix falls below half of its previous value. We show, under some conditions on the prior distribution, that the expected (Bayesian) regret of PSRL-LQ accumulated up to time $T$ is bounded by $\tilde{O}(\sqrt{T})$, where $\tilde{O}(\cdot)$ hides constants and logarithmic factors. Numerical simulations are provided to illustrate the performance of PSRL-LQ.
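As a rough illustration of the episode structure described in the abstract (posterior sampling of the system parameters, with two criteria for ending an episode), the following Python sketch simulates PSRL-LQ-style episodes on a toy system. All dimensions, cost weights, the prior, and the "true" dynamics below are assumptions made for this sketch, not values from the paper; the paper also restricts sampling to a set of stabilizable parameters, which is only crudely emulated here by resampling when the Riccati solver fails.

```python
# Illustrative sketch only: assumed toy system, prior, and noise parameters.
import numpy as np
from scipy.linalg import solve_discrete_are

rng = np.random.default_rng(0)
n, m = 2, 1                              # state / input dimensions (assumed)
Q, R = np.eye(n), np.eye(m)              # quadratic cost weights (assumed)
sigma2 = 1.0                             # process-noise variance (assumed)

# "True" system, used here only to simulate trajectories.
A_true = np.array([[1.0, 0.1],
                   [0.0, 1.0]])
B_true = np.array([[0.0],
                   [0.1]])
theta_true = np.hstack([A_true, B_true])   # rows map z = [x; u] to next state

# Gaussian prior over theta (row-wise, shared covariance Sigma).
mu = np.zeros((n, n + m))
Sigma = np.eye(n + m)

def lqr_gain(theta):
    """Certainty-equivalent LQR gain for the sampled parameters, u = -K x."""
    A, B = theta[:, :n], theta[:, n:]
    P = solve_discrete_are(A, B, Q, R)
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

x = np.zeros(n)
t, T_prev, horizon = 0, 0, 2000
while t < horizon:
    # New episode: draw theta from the posterior and fix the gain for the episode.
    while True:
        draw = mu + rng.standard_normal((n, n + m)) @ np.linalg.cholesky(Sigma).T
        try:
            K = lqr_gain(draw)
            break
        except (np.linalg.LinAlgError, ValueError):
            continue                     # crude stand-in for restricting to stabilizable draws

    t_start, det_start = t, np.linalg.det(Sigma)
    while t < horizon:
        u = -K @ x
        z = np.concatenate([x, u])
        x_next = theta_true @ z + np.sqrt(sigma2) * rng.standard_normal(n)

        # Conjugate Gaussian update of the posterior over theta.
        Sigma_inv = np.linalg.inv(Sigma)
        Sigma = np.linalg.inv(Sigma_inv + np.outer(z, z) / sigma2)
        mu = (Sigma @ (Sigma_inv @ mu.T + np.outer(z, x_next) / sigma2)).T

        x, t = x_next, t + 1

        # Criterion 1: the episode length may exceed the previous one by at most one step.
        if t - t_start > T_prev:
            break
        # Criterion 2: the covariance determinant drops below half its value at episode start.
        if np.linalg.det(Sigma) < 0.5 * det_start:
            break
    T_prev = t - t_start
```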
Year
2020
DOI
10.1109/TAC.2019.2950156
Venue
IEEE Transactions on Automatic Control
Keywords
Heuristic algorithms, Aerospace electronics, Bayes methods, Adaptive control, Reinforcement learning, Optimal control, Perturbation methods
DocType
Journal
Volume
65
Issue
8
ISSN
0018-9286
Citations
0
PageRank
0.34
References
4
Authors
3
Name            Order  Citations  PageRank
Yi Ouyang       1      43         10.16
Mukul Gagrani   2      16         4.52
Rahul Jain      3      784        71.51