Title
Control strategies for a stochastic planner
Abstract
We present new algorithms for local planning over Markov decision processes. The base-level algorithm possesses several interesting features for control of computation, based on selecting computations according to their expected benefit to decision quality. The algorithms are shown to expand the agent's knowledge where the world warrants it, with appropriate responsiveness to time pressure and randomness. We then develop an introspective algorithm, using an internal representation of what computational work has already been done. This strategy extends the agent's knowledge base where warranted by the agent's world model and the agent's knowledge of the work already put into various parts of this model. It also enables the agent to act so as to take advantage of the computational savings inherent in staying in known parts of the state space. The control flexibility provided by this strategy, by incorporating natural problem-solving methods, directs computational effort towards where it's needed better than previous approaches, providing greater hopes for scalability to large domains.

assign the goal state a reward of 0 and all other states a reward of -1. Problems using such a reward function include the path-planning problem on a grid with obstacles and imperfect motor control, and the ubiquitous 8-puzzle, but with random errors associated with actions. (The model can also handle problems having several stop states of different values.) In this domain, a plan takes the form of a policy assigning to each state an action choice. The agent tries to choose a policy maximizing its cumulative reward. (For domains involving unbounded time, it is common to discount future gains by an amount exponential in time to
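
The reward scheme described above (0 at the goal, -1 everywhere else, with a policy mapping each state to an action) can be illustrated with a small sketch. This is not the local planning algorithm of the paper; it is plain value iteration over the whole state space, used only to show the problem formulation. The function names, the grid encoding, and the slip model (with probability `slip` the motor fails and the agent stays where it is) are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

State = Tuple[int, int]
ACTIONS: Dict[str, Tuple[int, int]] = {
    "up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1),
}


def step(grid: List[str], s: State, move: Tuple[int, int]) -> State:
    """Intended effect of a move; hitting a wall or obstacle ('#') leaves s unchanged."""
    r, c = s[0] + move[0], s[1] + move[1]
    if 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] != "#":
        return (r, c)
    return s


def value_iteration(grid: List[str], goal: State, slip: float = 0.2,
                    tol: float = 1e-9, max_sweeps: int = 10_000):
    """Undiscounted value iteration with reward -1 per non-goal state, 0 at the goal.

    Assumed noise model (hypothetical): the chosen action fails with
    probability `slip`, in which case the agent does not move.
    """
    states = [(r, c) for r, row in enumerate(grid)
              for c, ch in enumerate(row) if ch != "#"]
    V = {s: 0.0 for s in states}
    policy: Dict[State, str] = {}
    for _ in range(max_sweeps):
        delta = 0.0
        for s in states:
            if s == goal:
                continue  # absorbing stop state, value stays 0
            best_a, best_q = None, float("-inf")
            for a, move in ACTIONS.items():
                q = -1.0 + (1 - slip) * V[step(grid, s, move)] + slip * V[s]
                if q > best_q:
                    best_a, best_q = a, q
            delta = max(delta, abs(best_q - V[s]))
            V[s] = best_q
            policy[s] = best_a
        if delta < tol:
            break
    return V, policy


if __name__ == "__main__":
    grid = ["...", ".#.", "..."]  # '#' marks an obstacle
    V, policy = value_iteration(grid, goal=(2, 2))
    print(V[(0, 0)], policy[(2, 1)])
```

Under this assumed slip model each intended move takes 1/(1 - slip) attempts on average, so a state d moves from the goal has value -d/(1 - slip). Sweeping every state, as done here, is exactly the cost that a local planner with control of computation tries to avoid on large state spaces.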
Year: 1994
Venue: AAAI
Keywords: stochastic planner, control strategy, path planning, motor control, cumulant, markov decision process, state space
Field: Mathematical optimization, Computer science, Markov decision process, Planner, Artificial intelligence, Knowledge base, Decision quality, State space, Machine learning, Computation, Scalability, Randomness
DocType: Conference
ISBN: 0-262-61102-3
Citations: 24
PageRank: 8.00
References: 6
Authors: 2
Name                Order   Citations   PageRank
Jonathan Tash       1       24          8.00
Stuart J. Russell   2       5731        796.47