Title
Negative Update Intervals in Deep Multi-Agent Reinforcement Learning
Abstract
In Multi-Agent Reinforcement Learning (MA-RL), independent cooperative learners must overcome a number of pathologies to learn optimal joint policies. Addressing one pathology often leaves approaches vulnerable to others. For instance, hysteretic Q-learning [15] addresses miscoordination while leaving agents vulnerable to misleading stochastic rewards. Other methods, such as leniency, have proven more robust when dealing with multiple pathologies simultaneously [29]. However, leniency has predominantly been studied within the context of strategic form games (bimatrix games) and fully observable Markov games consisting of a small number of probabilistic state transitions. This raises the question of whether these findings scale to more complex domains. For this purpose we implement a temporally extended version of the Climb Game [3], within which agents must overcome multiple pathologies simultaneously, including relative overgeneralisation, stochasticity, the alter-exploration and moving target problems, while learning from a large observation space. We find that existing lenient and hysteretic approaches fail to consistently learn near-optimal joint policies in this environment. To address these pathologies we introduce Negative Update Intervals-DDQN (NUI-DDQN), a Deep MA-RL algorithm which discards episodes yielding cumulative rewards outside the range of expanding intervals. NUI-DDQN consistently gravitates towards optimal joint policies in our environment, overcoming the outlined pathologies.
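The hysteretic Q-learning baseline mentioned in the abstract can be illustrated with a minimal tabular sketch. This is not the paper's deep implementation: the function name `hysteretic_update` and the learning rates `alpha` and `beta` are illustrative assumptions. The idea shown is that a positive temporal-difference error is applied with a larger learning rate than a negative one, which makes an agent optimistic about teammates' exploratory mistakes but also slow to correct for misleading stochastic rewards.

```python
def hysteretic_update(q, td_target, alpha=0.1, beta=0.01):
    """Return the updated Q-value for a single (state, action) entry.

    alpha and beta are assumed values for illustration; hysteretic
    Q-learning only requires beta < alpha.
    """
    delta = td_target - q                    # temporal-difference error
    rate = alpha if delta >= 0 else beta     # smaller rate for negative updates
    return q + rate * delta


# A high target moves the estimate quickly; a low target barely lowers it.
q = 0.0
q = hysteretic_update(q, td_target=10.0)
q = hysteretic_update(q, td_target=-30.0)
```

With these assumed rates the first update raises the estimate by a tenth of the error, while the second, much larger negative error lowers it by only a hundredth of its size.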
Year: 2018
DOI: 10.5555/3306127.3331672
Venue: Adaptive Agents and Multi-Agent Systems
Keywords: Deep Multi-Agent Reinforcement Learning
Field: Small number, Multiple pathologies, Overgeneralisation, Computer science, Markov chain, Artificial intelligence, Probabilistic logic, Machine learning, Management science, Reinforcement learning
DocType: Journal
Volume: abs/1809.05096
Citations: 0
PageRank: 0.34
References: 0
Authors: 3
Name | Order | Citations | PageRank
Gregory Palmer | 1 | 6 | 1.10
Rahul Savani | 2 | 243 | 30.09
Karl Tuyls | 3 | 1272 | 127.83