Abstract |
---|
We present a new method for learning good strategies in zero-sum Markov games in which each side is composed of multiple agents collaborating against an opposing team of agents. Our method requires full observability and communication during learning, but the learned policies can be executed in a distributed manner. The value function is represented as a factored linear architecture and its structure determines the necessary computational resources and communication bandwidth. This approach permits a tradeoff between simple representations with little or no communication between agents and complex, computationally intensive representations with extensive coordination between agents. Thus, we provide a principled means of using approximation to combat the exponential blowup in the joint action space of the participants. The approach is demonstrated with an example that shows the efficiency gains over naive enumeration. |
Year | Venue | Keywords
---|---|---
2002 | NIPS | col,value function

Field | DocType | Citations
---|---|---
Architecture, Observability, Exponential function, Computer science, Enumeration, Markov chain, Bellman equation, Communication bandwidth, Artificial intelligence, Machine learning | Conference | 5

PageRank | References | Authors
---|---|---
0.49 | 6 | 2
Name | Order | Citations | PageRank
---|---|---|---
Michail G. Lagoudakis | 1 | 1164 | 79.51
Ronald Parr | 2 | 2428 | 186.85