Abstract |
---|
We revisit the Bellman optimality equation with Nesterov's smoothing technique and provide a unique saddle-point optimization perspective on the policy optimization problem in reinforcement learning, based on Fenchel duality. A new reinforcement learning algorithm, called Smoothed Dual Embedding Control (SDEC), is derived to solve the saddle-point reformulation with arbitrary learnable function approximators. The algorithm bypasses the policy evaluation step of policy optimization in a principled way and is extensible to multi-step bootstrapping and eligibility traces. We provide a PAC-learning bound on the number of samples needed from a single off-policy sample path, and also characterize the convergence of the algorithm. Finally, we show the algorithm compares favorably to state-of-the-art baselines on several benchmark control problems. |
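The Nesterov smoothing mentioned in the abstract replaces the non-smooth max in the Bellman optimality operator with an entropy-regularized soft-max. A minimal sketch of that idea, using a log-sum-exp smoothing of `max` with temperature `lam` (variable names are illustrative, not from the paper):

```python
import numpy as np

def smooth_max(x, lam):
    """Log-sum-exp smoothing of max(x): lam * log(sum(exp(x / lam))).

    This is the soft-max that entropy regularization induces in the
    smoothed Bellman operator. As lam -> 0 it recovers max(x); for
    lam > 0 the gap above max(x) is at most lam * log(len(x)).
    """
    x = np.asarray(x, dtype=float)
    m = x.max()  # subtract the max before exponentiating, for numerical stability
    return m + lam * np.log(np.exp((x - m) / lam).sum())

# Example: soft-max over hypothetical Q-values at one state.
q = np.array([1.0, 2.0, 3.0])
print(smooth_max(q, 1.0))    # smooth upper bound on max(q) = 3.0
print(smooth_max(q, 0.001))  # close to the true max as lam -> 0
```

Unlike the hard max, `smooth_max` is differentiable everywhere, which is what makes the saddle-point reformulation amenable to gradient-based optimization with function approximators.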
Year | Venue | Field
---|---|---
2017 | arXiv: Learning | Convergence (routing), Saddle, Mathematical optimization, Embedding, Bootstrapping, Computer science, Smoothing, Sample path, Optimization problem, Reinforcement learning

DocType | Volume | Citations
---|---|---
Journal | abs/1712.10285 | 3

PageRank | References | Authors
---|---|---
0.39 | 0 | 7
Name | Order | Citations | PageRank
---|---|---|---
Bo Dai | 1 | 230 | 34.71 |
Albert Shaw | 2 | 26 | 2.45 |
Lihong Li | 3 | 670 | 45.28 |
Lin Xiao | 4 | 918 | 53.00
Niao He | 5 | 212 | 16.52 |
Jianshu Chen | 6 | 883 | 52.94 |
Le Song | 7 | 2437 | 159.27 |