Abstract
---
In reinforcement learning (RL), temporal abstraction remains an important and unsolved problem. The options framework offered a path toward temporal abstraction in RL, and the option-critic architecture elegantly solved the two problems of discovering options and training the RL agent in an end-to-end manner. However, it remains to be examined whether the options learned by this method play mutually exclusive roles. In this paper, we propose a Hellinger distance regularizer, a method for disentangling options. In addition, we shed light on several statistical indicators for comparison with the options learned through the existing option-critic architecture.
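For context, the squared Hellinger distance between two discrete distributions $P$ and $Q$ is a standard divergence measure. A plausible sketch of how such a regularizer could penalize overlap between option policies is given below; the notation $\pi_{\omega_i}(\cdot \mid s)$ for the intra-option policies and the specific regularizer form are assumptions for illustration, not taken from the paper:

$$
H^2(P, Q) = 1 - \sum_{a} \sqrt{P(a)\,Q(a)}, \qquad
\mathcal{L}_{\text{reg}} = -\lambda \sum_{i \neq j} H^2\!\big(\pi_{\omega_i}(\cdot \mid s),\; \pi_{\omega_j}(\cdot \mid s)\big)
$$

Since $H^2 \in [0, 1]$ and grows as the two policies diverge, subtracting this term (weighted by a coefficient $\lambda$) from the loss would push distinct options toward mutually exclusive action distributions.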
Year | Venue | DocType
---|---|---
2019 | arXiv: Learning | Journal

Volume | Citations | PageRank
---|---|---
abs/1904.06887 | 0 | 0.34

References | Authors
---|---
0 | 3
Name | Order | Citations | PageRank
---|---|---|---
Minsung Hyun | 1 | 0 | 1.01 |
Junyoung Choi | 2 | 31 | 5.93 |
Nojun Kwak | 3 | 862 | 63.79 |