Title
AlphaZero-based Proof Cost Network to Aid Game Solving
Abstract
The AlphaZero algorithm learns and plays games without hand-crafted expert knowledge. However, since its objective is to play well, we hypothesize that a better objective can be defined for the related but separate task of solving games. This paper proposes a novel approach to solving problems by modifying the training target of the AlphaZero algorithm, such that it prioritizes solving the game quickly, rather than winning. We train a Proof Cost Network (PCN), where proof cost is a heuristic that estimates the amount of work required to solve problems. This matches the general concept of the so-called proof number from proof number search, which has been shown to be well-suited for game solving. We propose two specific training targets. The first finds the shortest path to a solution, while the second estimates the proof cost. We conduct experiments on solving 15x15 Gomoku and 9x9 Killall-Go problems with both MCTS-based and FDFPN solvers. Comparisons between using AlphaZero networks and PCN as heuristics show that PCN can solve more problems.
Year
2022
Venue
International Conference on Learning Representations (ICLR)
Keywords
Monte-Carlo Tree Search, Solving Games, AlphaZero, Deep Reinforcement Learning
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
6
Name              Order  Citations  PageRank
Ti-Rong Wu        1      1          1.71
Chung-Chin Shih   2      10         1.37
Ting Han Wei      3      10         7.89
Meng-Yu Tsai      4      0          0.34
Wei-Yuan Hsu      5      0          0.34
I-Chen Wu         6      208        55.03