Title
---
Damped Anderson Mixing for Deep Reinforcement Learning: Acceleration, Convergence, and Stabilization
Abstract
---
Anderson mixing has been heuristically applied to reinforcement learning (RL) algorithms for accelerating convergence and improving the sampling efficiency of deep RL. Despite this empirical improvement in convergence, a rigorous mathematical justification for the benefits of Anderson mixing in RL has not yet been put forward. In this paper, we provide deeper insights into a class of acceleration schemes built on Anderson mixing that improve the convergence of deep RL algorithms. Our main results establish a connection between Anderson mixing and quasi-Newton methods and prove that Anderson mixing increases the convergence radius of policy iteration schemes by an extra contraction factor. The key focus of the analysis is rooted in the fixed-point-iteration nature of RL. We further propose a stabilization strategy that introduces a stable regularization term into Anderson mixing, together with a differentiable, non-expansive MellowMax operator, allowing both faster convergence and more stable behavior. Extensive experiments demonstrate that our proposed method enhances the convergence, stability, and performance of RL algorithms.
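To make the abstract's two ingredients concrete, below is a minimal NumPy sketch of (i) the MellowMax operator and (ii) damped Anderson mixing with a Tikhonov-regularized least-squares coefficient solve for a generic fixed-point map. This is an illustrative sketch only: the function names, the specific regularization form, and all hyperparameter values (`omega`, `m`, `beta`, `reg`) are assumptions, not the paper's exact algorithm.

```python
import numpy as np


def mellowmax(x, omega=5.0):
    """MellowMax operator mm_w(x) = log(mean(exp(w * x))) / w.

    A differentiable, non-expansive alternative to the hard max
    (e.g., in Bellman backups). `omega=5.0` is an assumed value.
    """
    z = omega * np.asarray(x, dtype=float)
    c = np.max(z)  # shift for numerical stability of the log-sum-exp
    return (c + np.log(np.mean(np.exp(z - c)))) / omega


def damped_anderson(g, x0, m=5, beta=0.5, reg=1e-8, tol=1e-10, max_iter=500):
    """Damped Anderson mixing AA(m) for the fixed-point problem x = g(x).

    `reg` adds Tikhonov regularization to the least-squares solve for the
    mixing coefficients; this mirrors the paper's stabilization idea only
    in spirit and is not the authors' exact scheme.
    """
    x = np.asarray(x0, dtype=float)
    f = g(x) - x                      # fixed-point residual
    xs, fs = [x], [f]
    for _ in range(max_iter):
        mk = len(fs) - 1
        if mk == 0:
            x_new = x + beta * f      # plain damped fixed-point step
        else:
            dX = np.stack([xs[i + 1] - xs[i] for i in range(mk)], axis=1)
            dF = np.stack([fs[i + 1] - fs[i] for i in range(mk)], axis=1)
            # gamma = argmin ||f - dF @ gamma||^2 + reg * ||gamma||^2
            gamma = np.linalg.solve(dF.T @ dF + reg * np.eye(mk), dF.T @ f)
            x_new = x + beta * f - (dX + beta * dF) @ gamma
        f_new = g(x_new) - x_new
        if np.linalg.norm(f_new) < tol:
            return x_new
        xs.append(x_new)
        fs.append(f_new)
        if len(fs) > m + 1:           # keep a sliding window of m differences
            xs.pop(0)
            fs.pop(0)
        x, f = x_new, f_new
    return x


# Usage: accelerate a toy contractive affine map x -> A @ x + b,
# standing in for a Bellman-style fixed-point operator.
rng = np.random.default_rng(0)
A = 0.9 * np.eye(4)
b = rng.standard_normal(4)
x_star = damped_anderson(lambda x: A @ x + b, np.zeros(4))
assert np.allclose(x_star, A @ x_star + b)
```

On this toy contraction, the mixing step reuses the residual history to outpace the plain damped iteration; in the paper's RL setting, `g` would be a policy-iteration/Bellman-style operator, with MellowMax assumed to replace the hard max inside it.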
Year | Venue | DocType |
---|---|---
2021 | Annual Conference on Neural Information Processing Systems | Conference |
Citations | PageRank | References
---|---|---
0 | 0.34 | 0
Authors
---
8
Name | Order | Citations | PageRank |
---|---|---|---
Ke Sun | 1 | 0 | 0.68 |
Yafei Wang | 2 | 0 | 1.01 |
Yi Liu | 3 | 1 | 1.03 |
Yingnan Zhao | 4 | 0 | 0.68 |
Bo Pan | 5 | 0 | 0.68 |
Shangling Jui | 6 | 1 | 3.05 |
Bei Jiang | 7 | 7 | 2.84 |
Linglong Kong | 8 | 42 | 11.37 |