Title
KL-UCB-switch: optimal regret bounds for stochastic bandits from both a distribution-dependent and a distribution-free viewpoints.
Abstract
In the context of K-armed stochastic bandits with distributions only assumed to be supported by [0, 1], we introduce a new algorithm, KL-UCB-switch, and prove that it enjoys simultaneously a distribution-free regret bound of optimal order √(KT) and a distribution-dependent regret bound of optimal order as well, that is, matching the κ ln T lower bound by Lai and Robbins (1985) and Burnetas and Katehakis (1996).
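The two guarantees named in the abstract can be written out as follows. This is a sketch in standard bandit notation, not a quotation from the paper: the symbols R_T, μ*, μ_a, ν_a, C and K_inf are assumptions introduced here, and the expression for κ is the usual Burnetas-Katehakis constant, which this record does not spell out.

```latex
% Assumed notation: arms a = 1..K with distributions \nu_a on [0,1],
% means \mu_a, best mean \mu^\star = \max_a \mu_a, horizon T,
% and Y_t the reward obtained at round t.
\[
  R_T \;=\; T\mu^\star - \mathbb{E}\Bigl[\sum_{t=1}^{T} Y_t\Bigr]
\]
% Distribution-free bound: uniform over all bandit problems on [0,1],
% with C an unspecified numerical constant.
\[
  \sup_{\nu_1,\dots,\nu_K} R_T \;\le\; C\sqrt{KT}
\]
% Distribution-dependent bound: matches the \kappa \ln T lower bound.
\[
  \limsup_{T\to\infty} \frac{R_T}{\ln T} \;\le\; \kappa
  \;=\; \sum_{a\,:\,\mu_a<\mu^\star}
        \frac{\mu^\star-\mu_a}{\mathcal{K}_{\inf}(\nu_a,\mu^\star)},
\]
\[
  \mathcal{K}_{\inf}(\nu_a,\mu^\star)
  \;=\; \inf\bigl\{\mathrm{KL}(\nu_a,\nu') :
        \nu' \text{ supported on } [0,1],\ \mathbb{E}(\nu')>\mu^\star\bigr\}
\]
```

The first bound is a worst case over all problems, while the second depends on the gaps of the specific problem; the abstract's claim is that a single algorithm achieves both at once.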
Year
2018
Venue
arXiv: Machine Learning
Field
Discrete mathematics, Confidence bounds, Mathematical optimization, Regret minimization, Regret, Upper and lower bounds, Viewpoints, Empirical likelihood, Mathematics, Bounded function
DocType
Journal
Volume
abs/1805.05071
Citations
0
PageRank
0.34
References
7
Authors
4
Name, Order, Citations, PageRank
Aurélien Garivier, 1, 547, 33.15
Hédi Hadiji, 2, 0, 1.01
Pierre Ménard, 3, 19, 6.45
Gilles Stoltz, 4, 351, 31.53