Abstract | ||
---|---|---|
Promoting behavioural diversity is critical for solving games with non-transitive dynamics where strategic cycles exist, and there is no consistent winner (e.g., Rock-Paper-Scissors). Yet, there is a lack of rigorous treatment for defining diversity and constructing diversity-aware learning dynamics. In this work, we offer a geometric interpretation of behavioural diversity in games and introduce a novel diversity metric based on determinantal point processes (DPP). By incorporating the diversity metric into best-response dynamics, we develop diverse fictitious play and diverse policy-space response oracle for solving normal-form games and open-ended games. We prove the uniqueness of the diverse best response and the convergence of our algorithms on two-player games. Importantly, we show that maximising the DPP-based diversity metric guarantees to enlarge the gamescape - convex polytopes spanned by agents' mixtures of strategies. To validate our diversity-aware solvers, we test on tens of games that show strong non-transitivity. Results suggest that our methods achieve at least the same, and in most games, lower exploitability than PSRO solvers by finding effective and diverse strategies. |
Year | Venue | DocType |
---|---|---|
2021 | INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139 | Conference |
Volume | ISSN | Citations |
139 | 2640-3498 | 0 |
PageRank | References | Authors |
0.34 | 10 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Nicolas Perez Nieves | 1 | 0 | 0.34 |
Yaodong Yang | 2 | 41 | 11.92 |
Oliver Slumbers | 3 | 0 | 0.68 |
David Mguni | 4 | 1 | 2.71 |
Jun Wang | 5 | 2514 | 138.37 |