Title | ||
---|---|---|
KL-UCB-switch: optimal regret bounds for stochastic bandits from both a distribution-dependent and a distribution-free viewpoints. |
Abstract | ||
---|---|---|
In the context of K-armed stochastic bandits with distributions only assumed to be supported on [0, 1], we introduce a new algorithm, KL-UCB-switch, and prove that it simultaneously enjoys a distribution-free regret bound of optimal order √(KT) and a distribution-dependent regret bound of optimal order as well, that is, matching the κ ln T lower bound of Lai and Robbins (1985) and Burnetas and Katehakis (1996). |
Year | Venue | Field |
---|---|---|
2018 | arXiv: Machine Learning | Discrete mathematics, Confidence bounds, Mathematical optimization, Regret minimization, Regret, Upper and lower bounds, Viewpoints, Empirical likelihood, Mathematics, Bounded function |
DocType | Volume | Citations |
---|---|---|
Journal | abs/1805.05071 | 0 |
PageRank | References | Authors |
---|---|---|
0.34 | 7 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Aurélien Garivier | 1 | 547 | 33.15 |
Hédi Hadiji | 2 | 0 | 1.01 |
Pierre Ménard | 3 | 19 | 6.45 |
Gilles Stoltz | 4 | 351 | 31.53 |