| Title |
|---|
| A Multi-Agent Off-Policy Actor-Critic Algorithm For Distributed Reinforcement Learning |
| Abstract |
|---|
| This paper extends off-policy reinforcement learning to the multi-agent case, in which a set of networked agents, communicating with their neighbors according to a time-varying graph, collaboratively evaluates and improves a target policy while following a distinct behavior policy. To this end, the paper develops a multi-agent version of emphatic temporal difference learning for off-policy policy evaluation and proves its convergence under linear function approximation. The paper then leverages this result, together with a novel multi-agent off-policy policy gradient theorem and recent work on both multi-agent on-policy and single-agent off-policy actor-critic methods, to develop and give convergence guarantees for a new multi-agent off-policy actor-critic algorithm. An empirical validation of these theoretical results is given. Copyright (C) 2020 The Authors. |
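For context on the critic update the abstract refers to, below is a minimal single-agent sketch of emphatic TD(λ) with linear function approximation (following Sutton, Mahmood, and White's formulation), plus a simple consensus-averaging step of the kind commonly used by networked agents in distributed RL. This is an illustrative assumption, not the paper's exact algorithm: the function names `etd_step` and `consensus`, the step sizes, and the doubly stochastic weight matrix `W` are all hypothetical choices for the sketch.

```python
import numpy as np

def etd_step(theta, e, F, phi_s, phi_s2, reward, rho, rho_prev,
             gamma=0.95, lam=0.0, alpha=0.01, interest=1.0):
    """One emphatic TD(lambda) update with linear features (sketch only).

    rho      : importance ratio pi(a|s) / mu(a|s) at the current step
    rho_prev : importance ratio from the previous step (0 on the first step)
    """
    F = rho_prev * gamma * F + interest        # follow-on trace
    M = lam * interest + (1.0 - lam) * F       # emphasis weighting
    e = rho * (gamma * lam * e + M * phi_s)    # emphatic eligibility trace
    delta = reward + gamma * theta @ phi_s2 - theta @ phi_s  # TD error
    theta = theta + alpha * delta * e          # critic parameter update
    return theta, e, F

def consensus(thetas, W):
    """Mix agents' critic parameters with a doubly stochastic matrix W.

    thetas : (n_agents, d) array, one parameter vector per agent
    W      : (n_agents, n_agents) consensus weights for the current graph
    """
    return W @ thetas
```

In a hypothetical multi-agent loop, each agent would run `etd_step` on its own transition and importance ratios, and all agents would then call `consensus(thetas, W)` with weights matching the current communication graph; the paper's actual multi-agent scheme and its convergence conditions are more involved than this sketch.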
| Year | DOI | Venue |
|---|---|---|
| 2019 | 10.1016/j.ifacol.2020.12.2021 | IFAC-PapersOnLine |

| Keywords | DocType | Volume |
|---|---|---|
| consensus and reinforcement learning control, adaptive control of multi-agent systems | Journal | 53 |

| Issue | ISSN | Citations |
|---|---|---|
| 2 | 2405-8963 | 0 |

| PageRank | References | Authors |
|---|---|---|
| 0.34 | 13 | 6 |
| Name | Order | Citations | PageRank |
|---|---|---|---|
| Wesley Suttle | 1 | 0 | 0.34 |
| Zhuoran Yang | 2 | 52 | 29.86 |
| Kaiqing Zhang | 3 | 48 | 13.02 |
| Zhaoran Wang | 4 | 157 | 33.20 |
| Tamer Basar | 5 | 3497 | 402.11 |
| Ji Liu | 6 | 146 | 26.61 |