Title
Dynamic-Depth Context Tree Weighting.
Abstract
Reinforcement learning (RL) in partially observable settings is challenging because the agent's observations are not Markov. Recently proposed methods can learn variable-order Markov models of the underlying process but have steep memory requirements and are sensitive to aliasing between observation histories due to sensor noise. This paper proposes dynamic-depth context tree weighting (D2-CTW), a model-learning method that addresses these limitations. D2-CTW dynamically expands a suffix tree while ensuring that the size of the model, but not its depth, remains bounded. We show that D2-CTW approximately matches the performance of state-of-the-art alternatives at stochastic time-series prediction while using at least an order of magnitude less memory. We also apply D2-CTW to model-based RL, showing that, on tasks that require memory of past observations, D2-CTW can learn without prior knowledge of a good state representation, or even the length of history upon which such a representation should depend.
Year
2017
Venue
Advances in Neural Information Processing Systems 30 (NIPS 2017)
Field
State representation, Observable, Markov model, Computer science, Algorithm, Context tree weighting, Aliasing, Artificial intelligence, Suffix tree, Machine learning, Bounded function, Reinforcement learning
DocType
Conference
Volume
30
ISSN
1049-5258
Citations
0
PageRank
0.34
References
0
Authors
3
Name | Order | Citations | PageRank
João V. Messias | 1 | 26 | 4.77
Shimon Whiteson | 2 | 1460 | 99.00
Messias, Joao V. | 3 | 0 | 0.34