On the Convergence of Competitive, Multi-Agent Gradient-Based Learning. - Citegraph

Paper Info

Title
On the Convergence of Competitive, Multi-Agent Gradient-Based Learning.

Abstract
As learning algorithms are increasingly deployed in markets and other competitive environments, understanding their dynamics is becoming increasingly important. We study the limiting behavior of competitive agents employing gradient-based learning algorithms. Specifically, we introduce a general framework for competitive gradient-based learning that encompasses a wide breadth of learning algorithms including policy gradient reinforcement learning, gradient based bandits, and certain online convex optimization algorithms. We show that unlike the single agent case, gradient learning schemes in competitive settings do not necessarily correspond to gradient flows and, hence, it is possible for limiting behaviors like periodic orbits to exist. We introduce a new class of games, Morse-Smale games, that correspond to gradient-like flows. We provide guarantees that competitive gradient-based learning algorithms (both in the full information and gradient-free settings) avoid linearly unstable critical points (i.e. strict saddle points and unstable limit cycles). Since generic local Nash equilibria are not unstable critical points---that is, in a formal mathematical sense, almost all Nash equilibria are not strict saddles---these results imply that gradient-based learning almost surely does not get stuck at critical points that do not correspond to Nash equilibria. For Morse-Smale games, we show that competitive gradient learning converges to linearly stable cycles (which includes stable Nash equilibria) almost surely. Finally, we specialize these results to commonly used multi-agent learning algorithms and provide illustrative examples that demonstrate the wide range of limiting behaviors competitive gradient learning exhibits.

Year	Venue	Field
2018	arXiv: Learning	Convergence (routing),Mathematical optimization,Saddle point,Computer science,Critical point (mathematics),Almost surely,Nash equilibrium,Convex optimization,Limiting,Reinforcement learning
DocType	Volume	Citations
Journal	abs/1804.05464	0
PageRank	References	Authors
0.34	15	2

Authors (2 rows)

Cited by (0 rows)

References (15 rows)

Name	Order	Citations	PageRank
Eric Mazumdar	1	13	7.50
Lillian J. Ratliff	2	87	23.32

1