Title
A loss function based parameterless learning automaton scheme.
Abstract
Learning automaton (LA) is a powerful tool for reinforcement learning in the field of Artificial Intelligence. A key issue in evaluating LA has always been the trade-off between “accuracy” and “speed”, which essentially comes down to parameter tuning. Owing to environmental randomness and complexity, parameter tuning typically takes millions of interactions, incurring a tremendous expense. To avoid this fatal flaw, this paper proposes a novel parameterless learning automaton named LFPLA. It has the intriguing property of not relying on manually configured parameters, and it possesses the ϵ-optimality property in any stationary random environment. A distinctive innovation lies in a newly defined loss function, which replaces the action probability vector maintained by conventional LA. Furthermore, a series of sampling strategies are designed for action selection, and a sufficiently small threshold serves as the iteration termination condition. In addition to establishing its advantageous performance theoretically through detailed mathematical proofs, we carried out extensive Monte Carlo simulations to illustrate its effectiveness in both two-action and multi-action benchmark environments. The proposed LFPLA converges faster and with higher accuracy than GBLA, currently the only other parameter-free LA. Moreover, it outperforms the state of the art among multi-action LA, especially in complex and confusing environments. Above all, it offers a unique benefit with regard to both tuning cost and interaction cost.
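The abstract outlines the LFPLA recipe at a high level: a loss function in place of the action probability vector, sampling-based action selection, and a small-threshold stopping rule. Since this record does not contain the paper's actual equations, the Python sketch below is only a hypothetical illustration of that recipe under assumed choices: Beta-posterior (Thompson-style) sampling stands in for the paper's sampling strategies, the empirical penalty rate stands in for its loss function, and the names lfpla_sketch, env_probs, and threshold are invented for illustration, not the authors' API.

import random

def lfpla_sketch(env_probs, threshold=1e-3, max_iters=1_000_000):
    # Hypothetical illustration only: the paper's exact LFPLA rules
    # are not reproduced here. env_probs[i] is the (unknown) reward
    # probability of action i in a stationary random environment.
    k = len(env_probs)
    pulls = [0] * k      # times each action was played
    rewards = [0] * k    # rewards observed per action
    best = 0

    for _ in range(max_iters):
        # Sampling-based action selection (assumed stand-in:
        # Thompson sampling from Beta posteriors).
        samples = [random.betavariate(rewards[i] + 1,
                                      pulls[i] - rewards[i] + 1)
                   for i in range(k)]
        a = max(range(k), key=samples.__getitem__)

        # One interaction with the random environment.
        pulls[a] += 1
        if random.random() < env_probs[a]:
            rewards[a] += 1

        # Empirical loss per action: the observed penalty rate.
        losses = [1.0 - rewards[i] / pulls[i] if pulls[i] else 1.0
                  for i in range(k)]
        best = min(range(k), key=losses.__getitem__)

        # Threshold-style termination (assumed rule): stop once the
        # leading action's loss estimate rests on enough samples.
        if pulls[best] and 1.0 / pulls[best] < threshold:
            break
    return best

# Example: a two-action environment with reward probabilities 0.8/0.6;
# the sketch should almost always return action 0.
print(lfpla_sketch([0.8, 0.6]))

Note that nothing in this loop is a hand-tuned learning parameter; only the stopping threshold remains, which mirrors the "parameterless" design the abstract describes.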
Year
2017
DOI
10.1016/j.neucom.2017.04.050
Venue
Neurocomputing
Keywords
Reinforcement learning, Learning automaton, Parameterless, Loss function, Monte-Carlo simulations
Field
Monte Carlo method, Computer science, Automaton, Mathematical proof, Artificial intelligence, Sampling (statistics), Probability vector, Action selection, Machine learning, Randomness, Reinforcement learning
DocType
Journal
Volume
260
ISSN
0925-2312
Citations
2
PageRank
0.37
References
15
Authors
3
Name           Order   Citations   PageRank
Ying Guo       1       56          17.72
Hao Ge         2       18          3.65
Shenghong Li   3       357         47.31