Title
Parametric Bandits: The Generalized Linear Case.
Abstract
We consider structured multi-armed bandit tasks in which the agent is guided by prior structural knowledge that can be exploited to efficiently select the optimal arm(s) in situations where the number of arms is large, or even infinite. We propose a new optimistic, UCB-like algorithm for non-linearly parameterized bandit problems using the Generalized Linear Model (GLM) framework. We analyze the regret of the proposed algorithm, termed GLM-UCB, obtaining results similar to those recently proved in the literature for the linear regression case. The analysis also highlights a key difficulty of the non-linear case, which is solved in GLM-UCB by focusing on the reward space rather than on the parameter space. Moreover, as the practical efficiency of current parameterized bandit algorithms is often disappointing, we provide an asymptotic argument leading to significantly faster convergence. Simulation studies on real data sets illustrate the performance and the robustness of the proposed GLM-UCB approach.
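The abstract's central idea, computing the optimistic index in the reward space of a GLM rather than in the parameter space, can be sketched in a few lines. The Python sketch below is a minimal, illustrative GLM-UCB-style loop assuming a logistic inverse link, a generic exploration schedule rho(t), and a squared-error surrogate for the maximum quasi-likelihood fit; these choices are assumptions made for illustration and are not the exact quantities or constants analyzed in the paper.

```python
import numpy as np
from scipy.optimize import minimize


def glm_ucb(arms, pull, horizon,
            mu=lambda z: 1.0 / (1.0 + np.exp(-z)),        # assumed logistic inverse link
            rho=lambda t: np.sqrt(2.0 * np.log(t + 2))):   # illustrative exploration rate
    """Minimal GLM-UCB-style loop (illustrative sketch, not the paper's exact algorithm).

    arms    : (K, d) array with one feature vector per arm.
    pull    : callable mapping an arm's feature vector to an observed reward.
    horizon : total number of rounds.
    """
    K, d = arms.shape
    M = np.eye(d)              # regularized design matrix sum_s x_s x_s^T
    xs, rs = [], []            # history of played arms and observed rewards
    theta_hat = np.zeros(d)

    def fit_loss(theta):
        # Squared-error surrogate for the GLM quasi-likelihood fit of theta.
        preds = mu(np.asarray(xs) @ theta)
        return np.sum((np.asarray(rs) - preds) ** 2)

    for t in range(horizon):
        if t < K:
            idx = t            # initialization: play each arm once
        else:
            theta_hat = minimize(fit_loss, theta_hat).x
            M_inv = np.linalg.inv(M)
            # Optimism in the reward space: predicted mean reward mu(x' theta_hat)
            # plus an exploration bonus scaling with ||x||_{M^{-1}}.
            widths = np.sqrt(np.einsum('kd,de,ke->k', arms, M_inv, arms))
            idx = int(np.argmax(mu(arms @ theta_hat) + rho(t) * widths))
        x = arms[idx]
        r = pull(x)
        xs.append(x)
        rs.append(r)
        M += np.outer(x, x)
    return theta_hat
```

In this sketch the exploration bonus is added after the non-linear link is applied, which is what distinguishes reward-space optimism from building a confidence set directly around the parameter vector; the fit and the inverse of M are also recomputed from scratch each round for clarity, whereas incremental rank-one updates would be the natural optimization in practice.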
Year
2010
Venue
NIPS
Field
Mathematical optimization, Parameterized complexity, Regret, Computer science, Generalization, Generalized linear model, Parametric statistics, Multi-armed bandit, Parameter space, Artificial intelligence, Machine learning, Finite time
DocType
Conference
Citations
61
PageRank
2.74
References
7
Authors
4
Name                Order  Citations  PageRank
Sarah Filippi       1      96         7.66
O. Cappe            2      2112       207.95
Aurélien Garivier   3      547        33.15
Csaba Szepesvári    4      1774       157.06