Title: Approximating optimal policies for partially observable stochastic domains
Abstract
The problem of making optimal decisions under uncertainty is central to Artificial Intelligence. If the state of the world is known at all times, the world can be modeled as a Markov Decision Process (MDP). MDPs have been studied extensively, and many methods are known for determining optimal courses of action, or policies. The more realistic case, in which state information is only partially observable and the world is modeled as a Partially Observable Markov Decision Process (POMDP), has received much less attention. The best exact algorithms for these problems can be very inefficient in both space and time. We introduce Smooth Partially Observable Value Approximation (SPOVA), a new approximation method that quickly yields good approximations that improve over time. This method can be combined with reinforcement learning methods, a combination that was very effective in our test cases.
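The abstract does not give SPOVA's functional form, so the following is a hypothetical sketch only: it assumes the commonly cited smooth-max form, in which the piecewise-linear POMDP value function max_i (b · α_i) over alpha vectors is replaced by a differentiable p-norm, so it can be tuned by gradient-based reinforcement learning updates.

```python
def spova_value(belief, alpha_vectors, k=8.0):
    """Smooth approximation of the POMDP value max_i (belief . alpha_i).

    Assumed SPOVA-style form (not taken from this record):
        V(b) = (sum_i (b . alpha_i)^k)^(1/k)
    As k grows, V(b) approaches the exact piecewise-linear maximum,
    while staying differentiable for gradient updates. Assumes each
    dot product b . alpha_i is positive.
    """
    dots = [sum(b * a for b, a in zip(belief, alpha)) for alpha in alpha_vectors]
    return sum(d ** k for d in dots) ** (1.0 / k)


# Hypothetical example: two alpha vectors over a two-state belief.
# The true maximum dot product here is 0.7*1.0 + 0.3*0.2 = 0.76;
# the smooth value lies slightly above it.
b = [0.7, 0.3]
alphas = [[1.0, 0.2], [0.1, 1.5]]
print(spova_value(b, alphas, k=8.0))
```

Because the approximation is differentiable everywhere, its gradient with respect to the alpha vectors is well defined, which is what makes combining it with reinforcement-learning-style updates straightforward.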
Year: 1995
Venue: IJCAI
Keywords: state information, smooth partially observable value approximation, markov decision process, partially observable markov decision process, artificial intelligence, optimal decision, partially observable stochastic domain, exact algorithm, optimal policy, approximation method, reinforcement learning
Field: Mathematical optimization, Observable, Computer science, Partially observable Markov decision process, Markov model, Spacetime, Markov decision process, Q-learning, Artificial intelligence, Test case, Machine learning, Reinforcement learning
DocType: Conference
ISBN: 1-55860-363-8
Citations: 66
PageRank: 21.65
References: 5
Authors: 2
Name               Order  Citations  PageRank
Ronald Parr        1      2428       186.85
Stuart J. Russell  2      57317      96.47