Abstract |
---|
We develop a new approach, named Greedy when Sure and Conservative when Uncertain (GSCU), for competing online against unknown and nonstationary opponents. GSCU improves on prior approaches in four ways: 1) it introduces a novel way of learning opponent policy embeddings offline; 2) it trains offline a single best response, conditioned additionally on the opponent policy embedding, instead of a finite set of separate best responses to individual opponents; 3) it computes online a posterior over the current opponent's policy embedding, avoiding the discrete and often ineffective decision of which type the current opponent belongs to; and 4) it selects online between a real-time greedy policy and a fixed conservative policy via an adversarial bandit algorithm, achieving a theoretically better regret than adhering to either alone. Experimental studies on popular benchmarks demonstrate GSCU's superiority over state-of-the-art methods. The code is available online at \url{https://github.com/YeTianJHU/GSCU}. |
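The online policy selection described in point 4 can be sketched with a standard two-arm EXP3 adversarial bandit, where one arm plays the greedy policy and the other the conservative policy. The sketch below is a minimal illustration of generic EXP3, not the paper's exact formulation; the `gamma` exploration rate and the assumption of rewards scaled to [0, 1] are my own choices.

```python
import math
import random

def exp3_select(weights, gamma):
    """Mix the normalized weights with uniform exploration and sample an arm."""
    k = len(weights)
    total = sum(weights)
    probs = [(1 - gamma) * w / total + gamma / k for w in weights]
    r, cum = random.random(), 0.0
    for arm, p in enumerate(probs):
        cum += p
        if r <= cum:
            return arm, probs
    return k - 1, probs

def exp3_update(weights, probs, arm, reward, gamma):
    """Exponentially reweight the pulled arm with an importance-weighted
    reward estimate; reward is assumed to be scaled to [0, 1]."""
    k = len(weights)
    estimate = reward / probs[arm]  # unbiased estimate of the arm's reward
    weights[arm] *= math.exp(gamma * estimate / k)
    return weights
```

In GSCU's setting, arm 0 would correspond to the real-time greedy (best-response) policy and arm 1 to the fixed conservative policy; EXP3's regret bound holds even against an adaptive adversary, which is what makes a bandit of this type suitable for choosing between the two online.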
Year | Venue | DocType |
---|---|---
2022 | International Conference on Machine Learning | Conference |
Citations | PageRank | References
---|---|---
0 | 0.34 | 0
Authors (11)
Name | Order | Citations | PageRank |
---|---|---|---
Haobo Fu | 1 | 5 | 1.11 |
Ye Tian | 2 | 0 | 0.34 |
Hongxiang Yu | 3 | 0 | 0.34 |
Weiming Liu | 4 | 0 | 1.01 |
Shuang Wu | 5 | 0 | 0.68 |
Jiechao Xiong | 6 | 0 | 0.34 |
Ying Wen | 7 | 0 | 2.37 |
Kai Li | 8 | 0 | 0.34 |
Junliang Xing | 9 | 1193 | 63.31 |
Qiang Fu | 10 | 1 | 4.42 |
Wei Yang | 11 | 0 | 0.34 |