Title
A Temporal Difference GNG-Based Approach for the State Space Quantization in Reinforcement Learning Environments
Abstract
A central issue when applying reinforcement learning algorithms is how the estimate of the value function can be mapped onto states. In a few cases it is possible to use tables, but in most cases the number of states is either too large to be kept in computer memory or too expensive to visit exhaustively. State aggregation models, such as self-organizing maps, have been used to address this by generalizing the input space and mapping value functions onto aggregated states. This paper proposes a new algorithm called TD-GNG that uses the Growing Neural Gas (GNG) network to solve reinforcement learning problems by providing a way to map value functions onto states. In experimental comparisons against TD-AVQ and uniform discretization on three reinforcement learning problems, TD-GNG showed improvements in three respects: 1) reduced dimensionality of the problem, 2) increased generalization, and 3) reduced convergence time. Experiments also showed that TD-GNG found a solution using less memory than TD-AVQ and uniform discretization without losing quality in the obtained policy.
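The abstract's core idea, that a GNG network acts as a learned quantizer of a continuous state space, with a temporal-difference value estimate attached to each GNG unit, can be sketched as below. This is a minimal illustration only: the class name, hyperparameter names, and the choice of a TD(0) update are assumptions, not the authors' TD-GNG algorithm.

```python
# Sketch: GNG quantizes a continuous state space; each unit carries a TD(0)
# value estimate. NOT the paper's TD-GNG -- names/parameters are assumptions.
import numpy as np


class TDGNGSketch:
    def __init__(self, dim, eps_b=0.05, eps_n=0.005, max_age=50,
                 insert_every=100, err_split=0.5, err_decay=0.995,
                 alpha=0.1, gamma=0.99, max_units=50, seed=0):
        rng = np.random.default_rng(seed)
        self.w = [rng.standard_normal(dim), rng.standard_normal(dim)]  # unit positions
        self.err = [0.0, 0.0]                # accumulated quantization error per unit
        self.v = [0.0, 0.0]                  # TD value estimate per unit
        self.edges = {frozenset((0, 1)): 0}  # edge -> age
        self.t = 0
        self.eps_b, self.eps_n, self.max_age = eps_b, eps_n, max_age
        self.insert_every, self.err_split = insert_every, err_split
        self.err_decay, self.alpha = err_decay, alpha
        self.gamma, self.max_units = gamma, max_units

    def _nearest(self, x):
        d = [float(np.sum((w - x) ** 2)) for w in self.w]
        order = np.argsort(d)
        return int(order[0]), int(order[1]), d

    def quantize(self, x):
        """Index of the GNG unit that represents continuous state x."""
        return self._nearest(np.asarray(x, dtype=float))[0]

    def adapt(self, x):
        """One Fritzke-style GNG step on input x; returns the winning unit."""
        x = np.asarray(x, dtype=float)
        s1, s2, d = self._nearest(x)
        self.err[s1] += d[s1]
        self.w[s1] += self.eps_b * (x - self.w[s1])        # move winner toward x
        for e in list(self.edges):
            if s1 in e:
                self.edges[e] += 1                         # age winner's edges
                n = next(iter(e - {s1}))
                self.w[n] += self.eps_n * (x - self.w[n])  # move neighbors too
        self.edges[frozenset((s1, s2))] = 0                # (re)connect s1 and s2
        self.edges = {e: a for e, a in self.edges.items() if a <= self.max_age}
        self.t += 1
        if self.t % self.insert_every == 0 and len(self.w) < self.max_units:
            self._insert_unit()
        self.err = [e * self.err_decay for e in self.err]
        return s1

    def _insert_unit(self):
        # Insert a unit halfway between the highest-error unit and its
        # highest-error neighbor (isolated-unit removal omitted for brevity).
        q = int(np.argmax(self.err))
        nbrs = [next(iter(e - {q})) for e in self.edges if q in e]
        if not nbrs:
            return
        f = max(nbrs, key=lambda n: self.err[n])
        r = len(self.w)
        self.w.append(0.5 * (self.w[q] + self.w[f]))
        self.err[q] *= self.err_split
        self.err[f] *= self.err_split
        self.err.append(self.err[q])
        self.v.append(0.5 * (self.v[q] + self.v[f]))  # new unit inherits a value
        self.edges.pop(frozenset((q, f)), None)
        self.edges[frozenset((q, r))] = 0
        self.edges[frozenset((f, r))] = 0

    def td_update(self, s, reward, s_next, done):
        """TD(0) update of the value attached to unit s."""
        target = reward + (0.0 if done else self.gamma * self.v[s_next])
        self.v[s] += self.alpha * (target - self.v[s])
```

In a hypothetical control loop, a transition (x, reward, x_next, done) would be handled as s = agent.adapt(x), s2 = agent.quantize(x_next), agent.td_update(s, reward, s2, done), so the quantizer and the per-unit value estimates are learned simultaneously.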
Year
2013
DOI
10.1109/ICTAI.2013.89
Venue
ICTAI
Keywords
uniform discretization, losing quality, input space, reinforcement problem, state space quantization, experimental comparison, neural gas, main issue, convergence time, value function, computer memory, reinforcement learning environments, temporal difference GNG-based approach, neural nets, learning (artificial intelligence)
Field
Competitive learning, Temporal difference learning, Computer science, Q-learning, Theoretical computer science, Unsupervised learning, Artificial intelligence, Artificial neural network, State space, Machine learning, Reinforcement learning, Learning classifier system
DocType
Conference
ISSN
1082-3409
Citations
0
PageRank
0.34
References
0
Authors
3