Abstract
---

We consider a structured multi-armed bandit problem in which the mean rewards of different arms are related through a hidden parameter. We propose an approach that generalizes classical bandit algorithms such as UCB and Thompson sampling to the structured bandit setting. Our approach exploits the structure in the problem to identify some arms as sub-optimal and pull them only O(1) times. This results in a significant reduction in cumulative regret, and in fact our algorithm achieves bounded (i.e., O(1)) regret whenever possible. We empirically demonstrate the superiority of our algorithms via simulations and experiments on the MovieLens dataset. Moreover, the problem setting we study in this paper subsumes several previously studied frameworks, such as Global, Regional, and Structured bandits with linear rewards.
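The abstract's central idea, using the known mean-reward functions of a hidden parameter to eliminate arms that cannot be optimal and running UCB only on the surviving "competitive" set, can be sketched as follows. The specific mean functions, the grid-based plug-in estimate of the hidden parameter, and the confidence margin are illustrative assumptions for this sketch, not the paper's exact algorithm.

```python
import math
import random

# Assumed structure: each arm's mean reward is a known function of a
# hidden parameter theta in [0, 1] (hypothetical mean functions).
MU = [lambda t: t, lambda t: 1.0 - t, lambda t: 0.5]

def structured_ucb(theta_star, horizon, seed=0):
    """UCB restricted to a 'competitive' set of arms.

    Arms whose mean function cannot be near-optimal for the estimated
    theta are excluded from UCB's argmax, so they are pulled only a
    bounded number of times (a crude sketch of the O(1)-pulls idea).
    """
    rng = random.Random(seed)
    K = len(MU)
    counts = [0] * K
    sums = [0.0] * K
    # Initialise by pulling each arm once (Gaussian noise is assumed).
    for k in range(K):
        counts[k] += 1
        sums[k] += MU[k](theta_star) + rng.gauss(0, 0.1)
    for n in range(K, horizon):
        # Plug-in estimate of theta: match the most-pulled arm's
        # empirical mean over a grid (purely illustrative).
        best = max(range(K), key=lambda k: counts[k])
        grid = [i / 100 for i in range(101)]
        theta_hat = min(
            grid, key=lambda t: abs(MU[best](t) - sums[best] / counts[best])
        )
        # Competitive set: arms within a shrinking confidence margin of
        # the best predicted mean at theta_hat.
        margin = math.sqrt(2 * math.log(n + 1) / counts[best])
        top = max(MU[k](theta_hat) for k in range(K))
        competitive = [k for k in range(K) if MU[k](theta_hat) >= top - margin]
        # Standard UCB index, but only over the competitive arms.
        arm = max(
            competitive,
            key=lambda k: sums[k] / counts[k]
            + math.sqrt(2 * math.log(n + 1) / counts[k]),
        )
        counts[arm] += 1
        sums[arm] += MU[arm](theta_star) + rng.gauss(0, 0.1)
    return counts

# With theta_star = 0.9 the first arm (mean 0.9) is optimal, and the
# other two arms should receive only a small fraction of the pulls.
counts = structured_ucb(theta_star=0.9, horizon=2000)
```

As the estimate of theta sharpens, the margin shrinks and sub-optimal arms leave the competitive set, which is what drives the bounded-regret behaviour described in the abstract.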
Field | Value
---|---
Year | 2018
Venue | arXiv: Machine Learning
Keywords | Mathematical optimization, Regret, MovieLens, Thompson sampling, Correlation, Mathematics, Bounded function
DocType | Journal
Volume | abs/1810.08164
Citations | 0
PageRank | 0.34
References | 0
Authors | 3
Name | Order | Citations | PageRank
---|---|---|---
Samarth Gupta | 1 | 13 | 5.60 |
Gauri Joshi | 2 | 308 | 29.70 |
Osman Yagan | 3 | 430 | 43.65 |