Abstract

We propose novel algorithms with first- and second-order regret bounds for adversarial linear bandits. These regret bounds imply that our algorithms perform well when there is an action achieving a small cumulative loss or the loss has a small variance. In addition, we need only assumptions weaker than those of existing algorithms; our algorithms work on discrete action sets as well as continuous ones without a priori knowledge about losses, and they run efficiently if a linear optimization oracle for the action set is available. These results are obtained by combining optimistic online optimization, continuous multiplicative weight update methods, and a novel technique that we refer to as distribution truncation. We also show that the regret bounds of our algorithms are tight up to polylogarithmic factors.
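For context, the regret notions mentioned in the abstract can be written out as follows. This is a minimal sketch using the standard definitions for adversarial linear bandits; the exact variance term and the dimension or logarithmic factors in the paper's bounds may differ from what is shown here.

```latex
% Standard adversarial linear bandit setting (sketch, not the paper's exact notation):
% at each round t = 1, ..., T the learner picks a_t from an action set A \subseteq R^d,
% the adversary picks a loss vector \ell_t, and only the scalar loss <\ell_t, a_t> is observed.
% The regret compares the learner's cumulative loss to the best fixed action in hindsight:
\[
  R_T \;=\; \sum_{t=1}^{T} \langle \ell_t, a_t \rangle
        \;-\; \min_{a \in \mathcal{A}} \sum_{t=1}^{T} \langle \ell_t, a \rangle .
\]
% A first-order bound scales with the cumulative loss of the best action,
%   L_T^* = \min_{a \in A} \sum_t <\ell_t, a>,
% roughly as \tilde{O}(\sqrt{L_T^*}), so it is small when some action has small total loss.
% A second-order bound instead scales with a variation/variance term such as
\[
  Q_T \;=\; \sum_{t=1}^{T} \bigl\| \ell_t - \bar{\ell} \bigr\|^2,
  \qquad \bar{\ell} = \tfrac{1}{T} \sum_{t=1}^{T} \ell_t,
\]
% so it is small when the loss vectors vary little across rounds.
```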
Year | Venue | DocType
---|---|---
2020 | NIPS 2020 | Conference

Volume | Citations | PageRank
---|---|---
33 | 0 | 0.34

References | Authors
---|---
0 | 4
Name | Order | Citations | PageRank |
---|---|---|---|
Shinji Ito | 1 | 8 | 6.26 |
Shuichi Hirahara | 2 | 3 | 7.48 |
Tasuku Soma | 3 | 11 | 2.71 |
Yuichi Yoshida | 4 | 469 | 44.88 |