Title
Decentralized Learning for Optimality in Stochastic Dynamic Teams and Games With Local Control and Global State Information
Abstract
Stochastic dynamic teams and games are rich models for decentralized systems and challenging testing grounds for multiagent learning. Previous work that guaranteed team optimality assumed stateless dynamics, or an explicit coordination mechanism, or joint-control sharing. In this article, we present an algorithm with guarantees of convergence to team optimal policies in teams and common interest games. The algorithm is a two-timescale method that uses a variant of Q-learning on the finer timescale to perform policy evaluation while exploring the policy space on the coarser timescale. Agents following this algorithm are “independent learners”: they use only local controls, local cost realizations, and global state information, without access to controls of other agents. The results presented here are the first, to the best of our knowledge, to give formal guarantees of convergence to team optimality using independent learners in stochastic dynamic teams and common interest games.
Year
2022
DOI
10.1109/TAC.2021.3121228
Venue
IEEE Transactions on Automatic Control
Keywords
Cooperative control, game theory, machine learning, stochastic games, stochastic optimal control
DocType
Journal
Volume
67
Issue
10
ISSN
0018-9286
Citations
0
PageRank
0.34
References
19
Authors
3
Name
Order
Citations
PageRank
Bora Yongacoglu100.68
Gurdal Arslan212315.09
Serdar Yüksel345753.31