Paper Info

Title
Learning Classical Planning Strategies with Policy Gradient.

Abstract
A common paradigm in classical planning is heuristic forward search. Forward search planners often rely on a relatively simple best-first search algorithm, which remains fixed throughout the search process. In this paper, we introduce a novel search framework capable of alternating between several forward search approaches while solving a particular planning problem. Selection of the approach is performed using a trainable stochastic policy. This enables tailoring the search strategy to a particular distribution of planning problems and a selected performance metric, such as the IPC score or running time. We construct a strategy space using five search algorithms and a two-dimensional representation of the planner's state. Strategies are then trained on randomly generated planning problems using policy gradient. Experimental results show that the learner is able to discover domain-specific search strategies, thus improving the planner's performance with respect to the chosen metric.
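
The idea in the abstract (a stochastic policy that picks one of five search routines from a two-dimensional planner state and is trained by policy gradient) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the paper's method: the linear-softmax parameterization, the plain REINFORCE update without a baseline, the made-up state features, and the hypothetical `toy_return` that stands in for a real metric such as IPC score or running time.

```python
import math
import random

N_ROUTINES = 5   # the abstract mentions five search algorithms
N_FEATURES = 2   # and a two-dimensional representation of the planner's state


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def action_probs(theta, features):
    """Linear-softmax policy (assumed form): one weight row per search routine."""
    scores = [sum(w * f for w, f in zip(row, features)) for row in theta]
    return softmax(scores)


def reinforce_update(theta, episode, episode_return, lr=0.1):
    """One REINFORCE step: theta += lr * R * sum_t grad log pi(a_t | s_t)."""
    grad = [[0.0] * N_FEATURES for _ in range(N_ROUTINES)]
    for features, action in episode:
        probs = action_probs(theta, features)
        for a in range(N_ROUTINES):
            indicator = 1.0 if a == action else 0.0
            for j in range(N_FEATURES):
                # Gradient of log-softmax w.r.t. theta[a][j].
                grad[a][j] += (indicator - probs[a]) * features[j]
    for a in range(N_ROUTINES):
        for j in range(N_FEATURES):
            theta[a][j] += lr * episode_return * grad[a][j]
    return theta


def toy_return(episode):
    """Hypothetical reward: fraction of steps that picked routine 0."""
    return sum(1.0 for _, a in episode if a == 0) / len(episode)


def train(n_episodes=500, steps=5, seed=0):
    """Train on a toy task; a real setup would run a planner instead."""
    rng = random.Random(seed)
    theta = [[0.0] * N_FEATURES for _ in range(N_ROUTINES)]
    for _ in range(n_episodes):
        episode = []
        for _ in range(steps):
            features = [1.0, rng.random()]  # made-up state features
            probs = action_probs(theta, features)
            action = rng.choices(range(N_ROUTINES), weights=probs, k=1)[0]
            episode.append((features, action))
        reinforce_update(theta, episode, toy_return(episode))
    return theta
```

Calling `train()` returns a 5-by-2 weight table, and `action_probs(theta, [1.0, 0.5])` then gives the learned selection probabilities for that state. In a real setting, each episode would be one planning run on a randomly generated problem, and the return would be the chosen performance metric.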

Year	Venue	Field
2018	International Conference on Automated Planning and Scheduling	Heuristic, Search algorithm, Computer science, Performance metric, Planner, Artificial intelligence, Machine learning
DocType	Volume	Citations
Journal	abs/1810.09923	0
PageRank	References	Authors
0.34	0	3

Authors (3 rows)

Name	Order	Citations	PageRank
Pawel Gomoluch	1	0	0.34
Dalal Alrajeh	2	119	13.75
Alessandra Russo	3	1022	80.10