Title
Learning Classical Planning Strategies with Policy Gradient.
Abstract
A common paradigm in classical planning is heuristic forward search. Forward search planners often rely on a relatively simple best-first search algorithm, which remains fixed throughout the search process. In this paper, we introduce a novel search framework capable of alternating between several forward search approaches while solving a particular planning problem. Selection of the approach is performed using a trainable stochastic policy. This enables tailoring the search strategy to a particular distribution of planning problems and a selected performance metric, such as the IPC score or running time. We construct a strategy space using five search algorithms and a two-dimensional representation of the planner's state. Strategies are then trained on randomly generated planning problems using policy gradient. Experimental results show that the learner is able to discover domain-specific search strategies, thus improving the planner's performance with respect to the chosen metric.
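The idea of a trainable stochastic policy choosing among search approaches can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the linear softmax parameterisation, the bias term, the learning rate, the class and method names, and the plain REINFORCE update are all hypothetical. Only the counts (five approaches, two state features) come from the abstract.

```python
import math
import random

NUM_APPROACHES = 5  # the abstract mentions five search algorithms
NUM_FEATURES = 2    # and a two-dimensional planner-state representation


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class SoftmaxPolicy:
    """Hypothetical linear softmax policy over search approaches."""

    def __init__(self, learning_rate=0.1, seed=0):
        self.rng = random.Random(seed)
        self.lr = learning_rate
        # One weight per (approach, feature), plus one bias per approach.
        self.weights = [[0.0] * (NUM_FEATURES + 1)
                        for _ in range(NUM_APPROACHES)]

    def probabilities(self, features):
        x = list(features) + [1.0]  # append constant bias input
        logits = [sum(w * xi for w, xi in zip(row, x))
                  for row in self.weights]
        return softmax(logits)

    def sample(self, features):
        """Draw the index of the approach to run next."""
        probs = self.probabilities(features)
        return self.rng.choices(range(NUM_APPROACHES), weights=probs)[0]

    def update(self, episode, episode_return, baseline=0.0):
        """REINFORCE step: w += lr * (R - b) * sum_t grad log pi(a_t | s_t)."""
        grad = [[0.0] * (NUM_FEATURES + 1) for _ in range(NUM_APPROACHES)]
        for features, action in episode:
            x = list(features) + [1.0]
            probs = self.probabilities(features)
            for a in range(NUM_APPROACHES):
                indicator = 1.0 if a == action else 0.0
                for j, xj in enumerate(x):
                    grad[a][j] += (indicator - probs[a]) * xj
        advantage = episode_return - baseline
        for a in range(NUM_APPROACHES):
            for j in range(NUM_FEATURES + 1):
                self.weights[a][j] += self.lr * advantage * grad[a][j]


policy = SoftmaxPolicy()
state = (0.5, 0.2)                      # made-up feature values
before = policy.probabilities(state)
policy.update([(state, 3)], episode_return=1.0)
after = policy.probabilities(state)
```

In a planner, the episode would be the sequence of (state features, chosen approach) pairs recorded while solving one problem, and the return would be derived from the chosen metric, for example a negated running time. That mapping is likewise an assumption of this sketch.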
Year: 2018
Venue: International Conference on Automated Planning and Scheduling
Field: Heuristic, Search algorithm, Computer science, Performance metric, Planner, Artificial intelligence, Machine learning
DocType: Journal
Volume: abs/1810.09923
Citations: 0
PageRank: 0.34
References: 0
Authors: 3
Name              Order  Citations  PageRank
Pawel Gomoluch    1      0          0.34
Dalal Alrajeh     2      119        13.75
Alessandra Russo  3      1022       80.10