Abstract | ||
---|---|---|
A learning automaton (LA) is a reinforcement learning unit that learns the optimal action in a stochastic environment. Great effort has been made to improve the performance of LA in environments that provide only reward or penalty. However, in many practical scenarios, the feedback from the environment is split into multiple levels. The latter type of environment is known in the LA community as the Q-model. This paper studies LA in Q-model environments, which have received little attention. We propose BILAML, a novel Bayesian inference-based LA that is capable of functioning in Q-model environments. We use Bayesian inference to estimate the environment's response to each action. The KL divergence is then adopted for adaptive decision-making. The BILAML scheme is proved to be ε-optimal and is shown to outperform established LA frameworks in comprehensive experiments. |
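
The two ingredients named in the abstract, a Bayesian estimate of each action's multi-level response and a KL divergence between such estimates, can be illustrated with a minimal sketch. This is not the BILAML algorithm: the symmetric Dirichlet prior, the three feedback levels, the counts, and every function and variable name below are assumptions made only for illustration.

```python
import math

# Assumed feedback levels of a Q-model environment (hypothetical values).
LEVELS = [0.0, 0.5, 1.0]


def posterior_mean(counts, prior=1.0):
    """Posterior mean of a symmetric Dirichlet(prior) after observing `counts`.

    counts[i] is how often feedback level i was observed for one action.
    """
    total = sum(counts) + prior * len(counts)
    return [(c + prior) / total for c in counts]


def kl_divergence(p, q):
    """KL(p || q) for two categorical distributions; q must be positive wherever p is."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def expected_feedback(dist):
    """Expected feedback value under an estimated response distribution."""
    return sum(level * prob for level, prob in zip(LEVELS, dist))


# Hypothetical observation counts per feedback level for two actions.
counts = {"a1": [8, 3, 1], "a2": [1, 3, 8]}

# Bayesian estimate of the environment's response to each action.
estimates = {action: posterior_mean(c) for action, c in counts.items()}

# Illustrative use of the estimates: rank actions by expected feedback and
# measure how far apart the two estimated response distributions are.
best_action = max(estimates, key=lambda a: expected_feedback(estimates[a]))
separation = kl_divergence(estimates["a1"], estimates["a2"])
```

In this sketch a larger `separation` means the two actions' estimated responses are easier to tell apart. How the paper actually turns the KL divergence into a decision rule is not stated in the abstract, so that step is deliberately left out.
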
Year | DOI | Venue |
---|---|---|
2021 | 10.1007/s10489-021-02230-8 | Applied Intelligence |
Keywords | DocType | Volume |
Learning automaton, Bayesian inference, Q-model environments | Journal | 51 |
Issue | ISSN | Citations |
10 | 0924-669X | 0 |
PageRank | References | Authors |
0.34 | 0 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Chong Di | 1 | 0 | 1.69 |
Fangqi Li | 2 | 4 | 3.08 |
Shenghong Li | 3 | 0 | 4.73 |
Jianwei Tian | 4 | 0 | 0.68 |