Title | ||
---|---|---|
On opportunistic techniques for solving decentralized Markov decision processes with temporal constraints |
Abstract | ||
---|---|---|
Decentralized Markov Decision Processes (DEC-MDPs) are a popular model of agent-coordination problems in domains with uncertainty and time constraints but very difficult to solve. In this paper, we improve a state-of-the-art heuristic solution method for DEC-MDPs, called OC-DEC-MDP, that has recently been shown to scale up to larger DEC-MDPs. Our heuristic solution method, called Value Function Propagation (VFP), combines two orthogonal improvements of OC-DEC-MDP. First, it speeds up OC-DEC-MDP by an order of magnitude by maintaining and manipulating a value function for each state (as a function of time) rather than a separate value for each pair of sate and time interval. Furthermore, it achieves better solution qualities than OC-DEC-MDP because, as our analytical results show, it does not overestimate the expected total reward like OC-DEC- MDP. We test both improvements independently in a crisis-management domain as well as for other types of domains. Our experimental results demonstrate a significant speedup of VFP over OC-DEC-MDP as well as higher solution qualities in a variety of situations. |
Year | DOI | Venue |
---|---|---|
2007 | 10.1145/1329125.1329389 | AAMAS |
Keywords | Field | DocType |
decentralized markov decision process,opportunistic technique,temporal constraint,value function,markov decision process,multi agent systems | Mathematical optimization,Heuristic,Computer science,Partially observable Markov decision process,Markov decision process,Bellman equation,Multi-agent system,Order of magnitude,Speedup | Conference |
Citations | PageRank | References |
12 | 0.64 | 14 |
Authors | ||
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Janusz Marecki | 1 | 685 | 49.06 |
Milind Tambe | 2 | 6008 | 522.25 |