Abstract |
---|
We study a version of the bandit best-arm identification problem with potentially adversarial rewards. A simple uniform random strategy obtains the optimal rate of error in the adversarial scenario. However, this type of strategy is sub-optimal when the rewards are sampled stochastically. Can we design a learner that performs optimally in both the stochastic and adversarial problems while not being aware of the nature of the rewards? First, we show that designing such a learner is impossible in general: to be robust to adversarial rewards, we can only guarantee optimal rates of error on a subset of the stochastic problems. We show a lower bound that characterizes the optimal rate in stochastic problems when the strategy is constrained to be robust to adversarial rewards. Finally, we design a simple parameter-free algorithm and show that its probability of error matches (up to logarithmic factors) the lower bound in stochastic problems, while remaining robust to adversarial problems. |
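The uniform random strategy mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the paper's algorithm: we assume a fixed budget of pulls, allocate them equally across arms via round-robin (which matches uniform allocation in expectation), and recommend the arm with the highest empirical mean. All names (`uniform_best_arm`, `pull`) are ours.

```python
def uniform_best_arm(pull, n_arms, budget):
    """Uniform allocation baseline for best-arm identification.

    `pull(arm)` returns a (possibly adversarial) reward for that arm;
    the function and parameter names are illustrative, not from the paper.
    """
    sums = [0.0] * n_arms
    counts = [0] * n_arms
    for t in range(budget):
        arm = t % n_arms  # round-robin: each arm pulled ~budget/n_arms times
        sums[arm] += pull(arm)
        counts[arm] += 1
    # Recommend the arm with the highest empirical mean reward.
    means = [s / c for s, c in zip(sums, counts)]
    return max(range(n_arms), key=lambda a: means[a])
```

Because the allocation ignores observed rewards entirely, an adversary gains nothing by manipulating them adaptively, which is the intuition behind its robustness; the same obliviousness is what makes it sub-optimal on easy stochastic instances.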
Year | Venue | Field |
---|---|---|
2018 | COLT | Mathematical optimization, Simple random sample, Computer science, Upper and lower bounds, Probability of error, Parameter identification problem, Adversarial system
DocType | Citations | PageRank
---|---|---
Conference | 1 | 0.35
References | Authors
---|---
0 | 5
Name | Order | Citations | PageRank |
---|---|---|---|
Yasin Abbasi-Yadkori | 1 | 273 | 23.80 |
Peter L. Bartlett | 2 | 5482 | 1039.97 |
Victor Gabillon | 3 | 116 | 9.51 |
Alan Malek | 4 | 15 | 2.41 |
Michal Valko | 5 | 212 | 37.24 |