Tackling Morpion Solitaire with AlphaZero-like Ranked Reward Reinforcement Learning - Citegraph

Paper Info

Title
Tackling Morpion Solitaire with AlphaZero-like Ranked Reward Reinforcement Learning

Abstract
Morpion Solitaire is a popular single-player game, played with paper and pencil. Because of its large state space (on the order of that of the game of Go), traditional search algorithms such as MCTS have not been able to find good solutions. A later algorithm, Nested Rollout Policy Adaptation, set a new record of 82 steps, albeit with large computational resources. To the best of our knowledge, no further progress has been reported in the roughly ten years since that record. In this paper we take the recent impressive performance of deep self-learning reinforcement learning approaches from AlphaGo/AlphaZero as inspiration to design a searcher for Morpion Solitaire. A challenge of Morpion Solitaire is that the state space is sparse: there are few win/loss signals. To address this, we use an approach known as ranked reward to create a reinforcement learning self-play framework for Morpion Solitaire. This enables us to find medium-quality solutions with reasonable computational effort. Our record is a 67-step solution, which is very close to the human best (68), obtained without any adaptation to the problem other than using ranked reward. We list many further avenues for potential improvement.
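The abstract names ranked reward without describing it. As a rough illustration only, the general idea is to replace the missing win/loss signal of a single-player game with a comparison against the agent's own recent results: an episode counts as a "win" if its score beats a percentile threshold of recent scores. The sketch below is not the authors' implementation; the class name `RankedRewardBuffer`, the buffer capacity, the 75th-percentile threshold, and the random tie-breaking are all assumptions chosen for illustration.

```python
import random
from collections import deque


class RankedRewardBuffer:
    """Hypothetical helper: maps a raw episode score (e.g. number of
    Morpion Solitaire steps) to a +1/-1 reward relative to recent scores."""

    def __init__(self, capacity=200, percentile=0.75, seed=None):
        # capacity and percentile are illustrative values, not from the paper.
        self.scores = deque(maxlen=capacity)
        self.percentile = percentile
        self.rng = random.Random(seed)

    def threshold(self):
        """Return the score at the configured percentile, or None if empty."""
        if not self.scores:
            return None
        ordered = sorted(self.scores)
        index = min(len(ordered) - 1, int(self.percentile * len(ordered)))
        return ordered[index]

    def reward(self, score):
        """Compare score with the current threshold, then record the score."""
        t = self.threshold()
        self.scores.append(score)
        if t is None or score == t:
            # Assumption: with no history, or on a tie, pick +1/-1 at random.
            return self.rng.choice((1, -1))
        return 1 if score > t else -1


if __name__ == "__main__":
    buf = RankedRewardBuffer(capacity=5, percentile=0.75, seed=0)
    for steps in (50, 55, 60, 65, 70, 67, 40):
        print(steps, buf.reward(steps))
```

Because the threshold moves up as recent scores improve, the same raw score can earn +1 early in training and -1 later, which gives a self-play-like learning signal in a game with no opponent.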

Year	2020
DOI	10.1109/SYNASC51798.2020.00033
Venue	2020 22nd International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC)
Keywords	Morpion Solitaire, Ranked Reward, Reinforcement Learning, AlphaZero, Self-play
DocType	Conference
ISSN	2470-8801
ISBN	978-1-7281-7629-1
Citations	0
PageRank	0.34
References	17
Authors	4

Authors (4 rows)

Name	Order	Citations	PageRank
Hui Wang	1	1	0.69
Mike Preuss	2	933	81.70
Michael Emmerich	3	1243	71.89
Aske Plaat	4	524	72.18
