Title
Correlation Minimizing Replay Memory in Temporal-Difference Reinforcement Learning
Abstract
Online reinforcement learning agents now process increasing amounts of data, which makes approximating and compressing these data into value functions a more demanding task. To improve approximation, and thus the learning process itself, it has been proposed to randomly select a mini-batch of past experiences stored in a replay memory buffer and replay it at each learning step. In this work, we present an algorithm that classifies experiences into separate contextual memory buffers using an unsupervised learning technique. Each new experience is then associated with a mini-batch of past experiences drawn from contextual buffers other than its own, further reducing the correlation between experiences. Experimental results show that correlation-minimizing sampling improves over Q-learning with uniform sampling, and that a significant additional improvement is observed when it is coupled with sampling methods that prioritize experiences by their temporal-difference error.
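The sampling scheme described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `ContextualReplayMemory` class, the online k-means clustering over raw state vectors, and all parameter names are hypothetical. The key idea from the abstract is preserved: experiences are assigned to contextual buffers by an unsupervised step, and mini-batches are drawn from buffers other than the one matching the current experience.

```python
import random


class ContextualReplayMemory:
    """Replay memory partitioned into contextual buffers.

    Hypothetical sketch: a simple online k-means step assigns each
    experience to one of `num_buffers` contextual buffers; sampling
    then draws only from buffers *other* than the current context,
    reducing correlation between the current and replayed experiences.
    """

    def __init__(self, num_buffers=4, capacity_per_buffer=1000, lr=0.1, seed=0):
        self.num_buffers = num_buffers
        self.capacity = capacity_per_buffer
        self.lr = lr  # online k-means step size (illustrative choice)
        self.buffers = [[] for _ in range(num_buffers)]
        self.centroids = None  # lazily initialised from the first state seen
        self.rng = random.Random(seed)

    def _nearest(self, state):
        # index of the centroid closest to `state` (squared Euclidean distance)
        dists = [sum((c - s) ** 2 for c, s in zip(cent, state))
                 for cent in self.centroids]
        return dists.index(min(dists))

    def _cluster(self, state):
        if self.centroids is None:
            # initialise all centroids with small jitter around the first state
            jit = random.Random(1)
            self.centroids = [[s + jit.uniform(-0.01, 0.01) for s in state]
                              for _ in range(self.num_buffers)]
        k = self._nearest(state)
        # move the winning centroid a small step toward the new state
        self.centroids[k] = [c + self.lr * (s - c)
                             for c, s in zip(self.centroids[k], state)]
        return k

    def add(self, state, action, reward, next_state, done):
        """Store an experience in the buffer of its contextual cluster."""
        k = self._cluster(state)
        buf = self.buffers[k]
        if len(buf) >= self.capacity:
            buf.pop(0)  # drop the oldest experience in this buffer
        buf.append((state, action, reward, next_state, done))
        return k

    def sample(self, current_state, batch_size):
        """Draw a mini-batch from buffers other than the current context."""
        k = self._nearest(current_state)
        pool = [e for i, b in enumerate(self.buffers) if i != k for e in b]
        if not pool:  # fall back to the current buffer if all others are empty
            pool = self.buffers[k]
        return self.rng.sample(pool, min(batch_size, len(pool)))
```

A usage sketch: after filling the memory with transitions, `sample(state, n)` returns up to `n` experiences whose contextual cluster differs from that of `state`, which could then be replayed through any TD update rule.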
Year
2020
DOI
10.1016/j.neucom.2020.02.004
Venue
Neurocomputing
Keywords
Reinforcement learning, Temporal-difference learning, Replay memory, Artificial neural networks
DocType
Journal
Volume
393
ISSN
0925-2312
Citations
0
PageRank
0.34
References
0
Authors
2
Name             Order  Citations  PageRank
Mirza Ramicic    1      2          1.41
Andrea Bonarini  2      623        76.73