Title
Cooperation in stochastic games through communication
Abstract
The application of reinforcement learning principles to the search for equilibrium policies in stochastic games (SGs) has met with some success ([3], [4], [2]). The key insight of this approach is that each agent can learn his own β-discounted reward equilibrium policy by keeping track of the Q-values of all the agents, including himself, and by treating the Q-value matrix for each state as his payoff matrix. Each agent sees which actions the other agents take and which payoffs they receive. There is some evidence that, in practice, agents that do not observe the actions and payoffs of other agents (hereafter denoted imperfectly observing agents) can still learn adversarial equilibrium (AE) policies in general-sum SGs ([1]) using naive Q-learning. Considering the Prisoners' Dilemma stage game (Table 1) as an abstraction of an SG, this implies that, even while ignoring the other agents' play, agents still learn to play DD, which is the adversarial equilibrium joint action. The payoff received in DD can be thought of as each agent's security level.
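The claim that imperfectly observing agents converge to mutual defection can be illustrated with a small simulation. The sketch below is not the paper's algorithm: it assumes the standard textbook Prisoner's Dilemma payoffs (the paper's Table 1 is not reproduced in this record) and runs two naive Q-learners that each observe only their own action and payoff. Because defection strictly dominates cooperation, both agents' greedy choices typically settle on the DD joint action, i.e. each agent's security level.

```python
import numpy as np

# Illustrative Prisoner's Dilemma payoffs (assumed; not the paper's Table 1).
# Action 0 = Cooperate (C), action 1 = Defect (D).
PAYOFF = np.array([[(3, 3), (0, 5)],   # row index  = agent 1's action
                   [(5, 0), (1, 1)]])  # col index  = agent 2's action


def play(episodes=50_000, alpha=0.1, epsilon=0.1, seed=0):
    """Two naive (imperfectly observing) Q-learners: each keeps Q-values
    over its own actions only and never sees the other's action or payoff."""
    rng = np.random.default_rng(seed)
    q = np.zeros((2, 2))               # q[i, a]: agent i's estimate for action a

    for _ in range(episodes):
        # Independent epsilon-greedy action selection per agent.
        acts = [rng.integers(2) if rng.random() < epsilon
                else int(np.argmax(q[i])) for i in range(2)]
        rewards = PAYOFF[acts[0], acts[1]]
        # Naive Q-update: stateless, no model of the other agent.
        for i in range(2):
            q[i, acts[i]] += alpha * (rewards[i] - q[i, acts[i]])

    return q


if __name__ == "__main__":
    q = play()
    # Defection (index 1) strictly dominates, so the greedy joint action
    # is typically DD, each agent's security level in this game.
    print(q)
    print(["CD"[int(np.argmax(q[i]))] for i in range(2)])
```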
Year
2005
DOI
10.1145/1082473.1082691
Venue
AAMAS
Keywords
stochastic game, key insight, dilemma stage game, naive q-learning, joint action, adversarial equilibrium, discounted reward equilibrium policy, q-value matrix, equilibrium policy, payoff matrix, general-sum sgs, reinforcement learning, combinatorial auction
Field
Mathematical economics, Abstraction, Computer science, Combinatorial auction, Repeated game, Artificial intelligence, Normal-form game, Dilemma, Machine learning, Stochastic game, Reinforcement learning, Adversarial system
DocType
Conference
ISBN
1-59593-093-0
Citations
2
PageRank
0.39
References
5
Authors
3
Name	Order	Citations	PageRank
Raghav Aras	1	35	3.32
Alain Dutech	2	86	11.37
François Charpillet	3	448	54.11