Similarity-Aware Kanerva Coding For On-Line Reinforcement Learning - Citegraph

Paper Info

Title
Similarity-Aware Kanerva Coding For On-Line Reinforcement Learning

Abstract
A major challenge in reinforcement learning (RL) is the use of a tabular representation for learned policies over a large number of states or state-action pairs. Function approximation is a promising tool for overcoming this limitation: it uses parameterized functions instead of a table to represent learned knowledge, which enables generalization. However, existing schemes cannot solve realistic RL problems, with their rapidly increasing demands for approximation accuracy and efficiency. In this paper, we extend the architecture of Sparse Distributed Memories (SDMs) and propose a novel on-line methodology, similarity-aware Kanerva coding (SAK), that closely represents the learned knowledge for very large-scale problems with significantly fewer parameterized components. SAK directly measures the real distances between state variables in all dimensions and formulates a new state similarity metric with an improved definition of state closeness. As a result, our scheme accurately distributes and generalizes knowledge among related states. We further enhance SAK's efficiency by allowing only a limited number of prototype states with certain similarities to be activated for value approximation, which reduces the risk of over-generalization. In addition, SAK eliminates size tuning and prototype reallocation for the prototype set, which not only broadens scalability but also significantly reduces the number of prototypes and the computational overhead needed for RL. Our extensive experimental results show that SAK achieves more than 48% improvement over existing schemes in learning quality, and reveal that SAK consistently learns good policies for RL with small overhead and short training times, even given roughly tuned scheme parameters.
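
The abstract describes activating only a limited number of similar prototypes when approximating a value. The following is a minimal sketch of that general idea, not the paper's algorithm: the class name `KNearestKanervaCoder`, the use of plain Euclidean distance as the similarity measure, the averaging of active weights, and the parameters `k` and `alpha` are all assumptions made for illustration. It estimates a state value only; the paper's actual similarity metric and update rule are not given in this record.

```python
import math
import random


class KNearestKanervaCoder:
    """Toy value approximator: only the k prototypes closest to a state are active.

    Hypothetical illustration; not the SAK implementation from the paper.
    """

    def __init__(self, prototypes, k, alpha=0.1):
        self.prototypes = [tuple(p) for p in prototypes]  # fixed prototype set
        self.k = k                                        # prototypes activated per state
        self.alpha = alpha                                # learning rate
        self.theta = [0.0] * len(self.prototypes)         # one weight per prototype

    def active(self, state):
        # Rank prototypes by Euclidean distance over all state dimensions
        # (an assumed stand-in for the paper's similarity metric).
        order = sorted(range(len(self.prototypes)),
                       key=lambda i: math.dist(state, self.prototypes[i]))
        return order[:self.k]

    def value(self, state):
        # The estimate is the mean weight of the active prototypes.
        idx = self.active(state)
        return sum(self.theta[i] for i in idx) / len(idx)

    def update(self, state, target):
        # Move the estimate toward the target by adjusting only active weights.
        idx = self.active(state)
        error = target - self.value(state)
        for i in idx:
            self.theta[i] += self.alpha * error
        return error


# Usage: 50 random 2-D prototypes, 5 of which are active for any given state.
random.seed(0)
protos = [(random.random(), random.random()) for _ in range(50)]
coder = KNearestKanervaCoder(protos, k=5)
for _ in range(200):
    coder.update((0.2, 0.7), target=1.0)
```

Because each update touches only the `k` active weights, states far from `(0.2, 0.7)` that share none of its active prototypes keep their previous estimates, which is the sense in which limiting activation restrains over-generalization in this sketch.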

Year	DOI	Venue
2018	10.1145/3271553.3271609	Proceedings of the 2nd International Conference on Vision, Image and Signal Processing (ICVISP 2018)
Keywords	DocType	Citations
Reinforcement learning, function approximation, Kanerva coding	Conference	0
PageRank	References	Authors
0.34	0	2

Authors (2 rows)

Name	Order	Citations	PageRank
Wei Li	1	0	0.34
Waleed Meleis	2	157	18.29
