Title
Adaptive Adjacency Kanerva Coding for Memory-Constrained Reinforcement Learning.
Abstract
For continuous or very large domains, a compact representation of the state space is preferable for practical reinforcement learning (RL). Such a representation can reduce the size of the state space and enable generalization by relating similar or neighboring states. However, many state abstraction techniques cannot achieve satisfactory approximation quality when memory is limited, while expert state space shaping can be costly and usually does not scale well. We have investigated the principle of Sparse Distributed Memories (SDMs) and applied it as a function approximator to learn good policies for RL. This paper describes a new approach, adaptive adjacency in SDMs, that is capable of representing very large continuous state spaces with a very small collection of prototype states. The algorithm enhances the SDM architecture to allow online generalization that adjusts dynamically to the assigned memory resources and provides high-quality approximation. The memory size and memory allocation no longer need to be manually assigned before and during RL. Based on our results, this approach performs well in terms of both approximation quality and memory usage. The superior performance of this approach over existing SDMs and tile coding (CMACs) is demonstrated through a comprehensive simulation study in two classic domains, Mountain Car with 2 dimensions and Hunter-Prey with 5 dimensions. Our empirical evaluations demonstrate that the adaptive adjacency approach can efficiently approximate value functions with limited memory, and that the approach scales well across the tested domains with continuous, large-scale state spaces.
Year
2018
Venue
MLDM
Field
Adjacency list, Architecture, Abstraction, Function approximation, Computer science, Coding (social sciences), Theoretical computer science, Memory management, Artificial intelligence, State space, Machine learning, Reinforcement learning
DocType
Conference
Citations
0
PageRank
0.34
References
13
Authors
2
Name, Order, Citations, PageRank
Wei Li, 1, 6, 1.57
Waleed Meleis, 2, 157, 18.29