Title
A loss function based parameterless learning automaton scheme.
Abstract
Learning automaton (LA) is a powerful tool for reinforcement learning in the field of Artificial Intelligence. A key issue in evaluating LA has always been the trade-off between “accuracy” and “speed”, which essentially comes down to parameter tuning. Owing to environmental randomness and complexity, parameter tuning typically takes millions of interactions, incurring a tremendous expense. To avoid this fatal flaw, this paper proposes a novel parameterless learning automaton named LFPLA. It has the intriguing property of not relying on manually configured parameters, and it possesses the ϵ-optimality property in any stationary random environment. A distinctive innovation lies in a newly defined loss function, which replaces the action probability vector maintained by conventional LA. Furthermore, a series of sampling strategies are designed for action selection, and a sufficiently small threshold serves as the iteration termination condition. In addition to establishing its advantageous performance theoretically through detailed mathematical proofs, we carried out extensive Monte Carlo simulations to illustrate its effectiveness in both two-action and multi-action benchmark environments. The proposed LFPLA converges faster and with higher accuracy than GBLA, currently the only other parameter-free LA. Moreover, it outperforms the state of the art among multi-action LA, especially in complex and confusing environments. Above all, it offers a unique benefit with regard to both tuning cost and interaction cost.
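The abstract outlines the LFPLA recipe at a high level: a loss function in place of the action probability vector, sampling-based action selection, and a small-threshold stopping rule. Since this record does not contain the paper's actual equations, the Python sketch below is only a hypothetical illustration of that recipe under assumed choices: Beta-posterior (Thompson-style) sampling stands in for the paper's sampling strategies, the empirical penalty rate stands in for its loss function, and the names lfpla_sketch, env_probs, and threshold are invented for illustration, not the authors' API.

import random

def lfpla_sketch(env_probs, threshold=1e-3, max_iters=1_000_000):
    # Hypothetical illustration only: the paper's exact LFPLA rules
    # are not reproduced here. env_probs[i] is the (unknown) reward
    # probability of action i in a stationary random environment.
    k = len(env_probs)
    pulls = [0] * k      # times each action was played
    rewards = [0] * k    # rewards observed per action
    best = 0

    for _ in range(max_iters):
        # Sampling-based action selection (assumed stand-in:
        # Thompson sampling from Beta posteriors).
        samples = [random.betavariate(rewards[i] + 1,
                                      pulls[i] - rewards[i] + 1)
                   for i in range(k)]
        a = max(range(k), key=samples.__getitem__)

        # One interaction with the random environment.
        pulls[a] += 1
        if random.random() < env_probs[a]:
            rewards[a] += 1

        # Empirical loss per action: the observed penalty rate.
        losses = [1.0 - rewards[i] / pulls[i] if pulls[i] else 1.0
                  for i in range(k)]
        best = min(range(k), key=losses.__getitem__)

        # Threshold-style termination (assumed rule): stop once the
        # leading action's loss estimate rests on enough samples.
        if pulls[best] and 1.0 / pulls[best] < threshold:
            break
    return best

# Example: a two-action environment with reward probabilities 0.8/0.6;
# the sketch should almost always return action 0.
print(lfpla_sketch([0.8, 0.6]))

Note that nothing in this loop is a hand-tuned learning parameter; only the stopping threshold remains, which mirrors the "parameterless" design the abstract describes.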
Year
2017
DOI
10.1016/j.neucom.2017.04.050
Venue
Neurocomputing
Keywords
Reinforcement learning, Learning automaton, Parameterless, Loss function, Monte-Carlo simulations
Field
Monte Carlo method, Computer science, Automaton, Mathematical proof, Artificial intelligence, Sampling (statistics), Probability vector, Action selection, Machine learning, Randomness, Reinforcement learning
DocType
Journal
Volume
260
ISSN
0925-2312
Citations
2
PageRank
0.37
References
15
Authors
3
Name           Order   Citations   PageRank
Ying Guo       1       56          17.72
Hao Ge         2       18          3.65
Shenghong Li   3       357         47.31