Title
Regret Analysis For Learning In A Multi-Agent Linear-Quadratic Control Problem
Abstract
We consider a multi-agent Linear-Quadratic (LQ) reinforcement learning problem consisting of three systems: an unknown system and two known systems. There are three agents: the actions of agent 1 can affect the unknown system as well as the two known systems, while the actions of agents 2 and 3 can only affect their respective co-located known systems. Further, the unknown system's state can affect the known systems' state evolution. In this paper, we are interested in minimizing the infinite-horizon average cost. We propose a Thompson Sampling (TS)-based multi-agent learning algorithm in which each agent learns the unknown system's dynamics independently. Our result indicates that, under certain assumptions, the expected regret of our algorithm is upper bounded by Õ(√T), where Õ(·) hides constants and logarithmic factors. Numerical simulations are provided to illustrate the performance of our proposed algorithm.
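The abstract describes a Thompson Sampling approach to learning unknown linear dynamics while controlling for quadratic cost. The following is a minimal single-agent, scalar-state sketch of that general idea, not the paper's multi-agent algorithm: the system parameters `a_true`, `b_true`, the episode schedule, the gain clipping, and the exploration dither are all illustrative assumptions. The learner keeps a Gaussian posterior over the dynamics, samples parameters at episode starts, and applies the LQR gain computed for the sample.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative unknown scalar system: x_{t+1} = a*x_t + b*u_t + w_t, w_t ~ N(0,1).
# (Scalar, single-agent simplification; not the paper's three-system setup.)
a_true, b_true = 0.9, 0.5
q, r = 1.0, 1.0  # stage cost q*x^2 + r*u^2

def lqr_gain(a, b, iters=200):
    """Iterate the scalar Riccati equation to a fixed point; return gain k for u = -k*x."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * p * b) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)

# Gaussian posterior over theta = (a, b) with known unit noise variance.
mu = np.zeros(2)
Lam = np.eye(2)  # precision matrix

x, k = 0.0, 0.0
episode_len, T = 20, 2000
for t in range(T):
    if t % episode_len == 0:
        # Thompson step: sample dynamics from the posterior, plan as if they were true.
        theta = rng.multivariate_normal(mu, np.linalg.inv(Lam))
        if abs(theta[1]) > 1e-2:  # skip degenerate (near-uncontrollable) samples
            # Clipping the gain is a crude stabilization guard for this sketch only.
            k = float(np.clip(lqr_gain(theta[0], theta[1]), -3.0, 3.0))
    # Tiny dither keeps (x, u) from being perfectly collinear (identifiability).
    u = -k * x + 0.01 * rng.standard_normal()
    x_next = a_true * x + b_true * u + rng.standard_normal()
    # Bayesian linear-regression update with regressor z = (x, u).
    z = np.array([x, u])
    Lam_new = Lam + np.outer(z, z)
    mu = np.linalg.solve(Lam_new, Lam @ mu + z * x_next)
    Lam = Lam_new
    x = x_next

print(mu)  # posterior mean; with enough data it concentrates near (a_true, b_true)
```

The episodic resampling is what trades exploration against exploitation: each sampled model is optimistic with posterior probability, and the regret analysis in TS-based LQ work bounds the cost of acting on wrong samples as the posterior concentrates.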
Year
2020
DOI
10.23919/ACC45564.2020.9147525
Venue
2020 AMERICAN CONTROL CONFERENCE (ACC)
DocType
Conference
ISSN
0743-1619
Citations
0
PageRank
0.34
References
0
Authors
3
Name | Order | Citations | PageRank
Seyed Mohammad Asghari | 1 | 7 | 3.55
Mukul Gagrani | 2 | 16 | 4.52
Ashutosh Nayyar | 3 | 240 | 30.84