Title
RLCFR: Minimize counterfactual regret by deep reinforcement learning
Abstract
Counterfactual regret minimization (CFR) is a popular method for solving decision-making problems in two-player zero-sum games with imperfect information. Unlike previous studies, which mostly focused on solving large-scale problems or improving solution efficiency, we propose RLCFR, a framework that aims to improve the generalization ability of CFR methods. In RLCFR, the game strategy is solved by CFR-based methods within a reinforcement learning (RL) framework: the dynamic procedure of iterative strategy updating is modeled as a Markov decision process (MDP), and our method learns a policy that selects the appropriate regret-updating rule at each step of the iteration. In addition, a stepwise reward function, proportional to how well the iterated strategy performs at each step, is formulated to learn this action policy. Extensive experiments on various games show that the generalization ability of our method is significantly improved compared with existing state-of-the-art methods.
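The abstract describes the mechanism only at a high level: the CFR iteration is cast as an MDP whose actions choose among regret-updating rules, with a stepwise reward tied to how much the current strategy improves. The sketch below illustrates that formulation under stated assumptions; it is not the authors' implementation. A toy zero-sum matrix game (rock-paper-scissors) stands in for the paper's imperfect-information games, three classic update rules (vanilla accumulation, CFR+-style clamping, linear weighting) stand in for the candidate methods, exploitability reduction serves as the stepwise reward, and a tabular Q-learner replaces the deep network. The class and variable names and the bucketed state encoding are all illustrative.

```python
import numpy as np

# Rock-paper-scissors payoff for the row player; a toy stand-in for the
# imperfect-information games evaluated in the paper.
PAYOFF = np.array([[0., -1., 1.], [1., 0., -1.], [-1., 1., 0.]])

def regret_matching(regret):
    """Map cumulative regrets to a strategy (uniform if no positive regret)."""
    pos = np.maximum(regret, 0.0)
    total = pos.sum()
    return pos / total if total > 0 else np.full(3, 1.0 / 3.0)

def exploitability(avg1, avg2):
    """Sum of best-response values against both average strategies;
    zero exactly at a Nash equilibrium of this zero-sum game."""
    return float((PAYOFF @ avg2).max() + (-(avg1 @ PAYOFF)).max())

class CFREnv:
    """One env step = a burst of regret updates with the chosen rule.
    The action set and state encoding are illustrative assumptions."""
    ACTIONS = ("vanilla", "plus", "linear")

    def __init__(self, seed=0):
        self.rng = np.random.default_rng(seed)

    def reset(self):
        # Random initial regrets start the solver away from equilibrium.
        self.r = [self.rng.random(3) for _ in range(2)]
        self.ssum = [regret_matching(self.r[p]) for p in range(2)]
        self.t = 1
        return self.state()

    def avg(self, p):
        return self.ssum[p] / self.ssum[p].sum()

    def state(self):
        # Coarse bucket of current exploitability (assumed state features).
        return min(int(exploitability(self.avg(0), self.avg(1)) * 5), 9)

    def step(self, action, burst=25):
        before = exploitability(self.avg(0), self.avg(1))
        for _ in range(burst):
            self.t += 1
            x, y = regret_matching(self.r[0]), regret_matching(self.r[1])
            value = float(x @ PAYOFF @ y)
            # Instantaneous regrets for both players in the matrix game.
            inst = [PAYOFF @ y - value, -(x @ PAYOFF) + value]
            w = float(self.t) if action == 2 else 1.0  # linear CFR weighting
            for p, sigma in ((0, x), (1, y)):
                self.r[p] = self.r[p] + w * inst[p]
                if action == 1:                        # CFR+-style clamping
                    self.r[p] = np.maximum(self.r[p], 0.0)
                self.ssum[p] = self.ssum[p] + w * sigma
        # Stepwise reward: proportional to the drop in exploitability.
        return self.state(), before - exploitability(self.avg(0), self.avg(1))

# Tabular Q-learning stands in for the paper's deep RL policy so the
# sketch stays dependency-free and runnable.
env, Q, rng = CFREnv(), np.zeros((10, 3)), np.random.default_rng(1)
for _ in range(50):                                    # episodes
    s = env.reset()
    for _ in range(40):                                # steps per episode
        a = int(rng.integers(3)) if rng.random() < 0.1 else int(Q[s].argmax())
        s2, rew = env.step(a)
        Q[s, a] += 0.1 * (rew + 0.95 * Q[s2].max() - Q[s, a])
        s = s2
print("preferred update rule per state:", [CFREnv.ACTIONS[i] for i in Q.argmax(1)])
```

In the paper's actual setting, the tabular Q-learner would be replaced by a deep network over richer state features, and the toy matrix-game environment by a CFR solver on the target imperfect-information game.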
Year
2022
DOI
10.1016/j.eswa.2021.115953
Venue
Expert Systems with Applications
Keywords
Counterfactual regret minimization, Decision-making, Imperfect information, Reinforcement learning
DocType
Journal
Volume
187
ISSN
0957-4174
Citations
0
PageRank
0.34
References
0
Authors
7
Name          Order  Citations  PageRank
Huale Li      1      0          1.01
Xuan Wang     2      291        57.12
Fengwei Jia   3      1          3.07
Yifan Li      4      1          1.16
Yulin Wu      5      0          1.01
Zhang Jiajia  6      3          6.01
Shuhan Qi     7      38         14.95