Abstract |
---|
We present a modification of the algorithm of Dani et al. (8) for the online linear optimization problem in the bandit setting, which with high probability has regret at most O(√T) against an adaptive adversary. This improves on the previous algorithm (8), whose regret is bounded only in expectation and only against an oblivious adversary. We obtain the same dependence on the dimension (n^{3/2}) as that exhibited by Dani et al. The results of this paper rest firmly on those of (8) and the remarkable technique of Auer et al. (2) for obtaining high-probability bounds via optimistic estimates. This paper answers an open question: it eliminates the gap between the high-probability bounds obtained in the full-information vs. bandit settings. |
Year | Venue | Keywords
---|---|---|
2008 | COLT | linear optimization

Field | DocType | Citations
---|---|---|
Linear optimization problem, Mathematical optimization, Regret, Computer science, Linear programming, Adversary, Bounded function | Conference | 13

PageRank | References | Authors
---|---|---|
1.34 | 7 | 6
Name | Order | Citations | PageRank |
---|---|---|---|
Peter L. Bartlett | 1 | 5482 | 1039.97 |
Varsha Dani | 2 | 432 | 40.05 |
Thomas P. Hayes | 3 | 659 | 54.21 |
Sham Kakade | 4 | 4365 | 282.77 |
Alexander Rakhlin | 5 | 732 | 62.84 |
Ambuj Tewari | 6 | 1371 | 99.22 |