Multiagent learning in adaptive dynamic systems - Citegraph

Paper Info

Title
Multiagent learning in adaptive dynamic systems

Abstract
Classically, approaches to multiagent policy learning assumed that the agents, via interactions and/or by using preliminary knowledge about the reward functions of all players, would find an interdependent solution called an "equilibrium". Recently, however, certain researchers have questioned the necessity and validity of equilibrium as the most important multiagent solution concept. They argue that a "good" learning algorithm is one that is efficient with respect to a certain class of counterparts. Adaptive players are an important class of agents that learn their policies separately from maintaining beliefs about their counterparts' future actions, and make their decisions based on that policy and the current belief. In this paper, we propose an efficient learning algorithm for the presence of adaptive counterparts, called Adaptive Dynamics Learner (ADL), which is able to learn an efficient policy over the opponents' adaptive dynamics rather than over simple actions and beliefs and, by so doing, to exploit these dynamics to obtain a higher utility than any equilibrium strategy can provide. We tested our algorithm on a substantial, representative set of the best-known and most demonstrative matrix games and observed that the ADL agent is highly efficient against the Adaptive Play Q-learning (APQ) agent and the Infinitesimal Gradient Ascent (IGA) agent. In self-play, when possible, ADL is able to converge to a Pareto optimal strategy maximizing the welfare of all players.

Year	DOI	Venue
2007	10.1145/1329125.1329174	AAMAS
Keywords	Field	DocType
equilibrium strategy,adaptive counterpart,adaptive dynamic system,adaptive dynamic,adaptive dynamics learner,multiagent policy,efficient policy,adl agent,adaptive play q-learning,adaptive player,efficient learning algorithm,adaptation,solution concept	Interdependence,Gradient descent,Computer science,Demonstrative,Exploit,Multiagent learning,Artificial intelligence,Solution concept,Dynamical system,Infinitesimal,Machine learning	Conference
Citations	PageRank	References
3	0.42	10
Authors
2

Authors (2 rows)

Name	Order	Citations	PageRank
Andriy Burkov	1	23	4.03
Brahim Chaib-draa	2	1190	113.23
