Title
Parametric Bandits: The Generalized Linear Case.
Abstract
We consider structured multi-armed bandit tasks in which the agent is guided by prior structural knowledge that can be exploited to efficiently select the optimal arm(s) in situations where the number of arms is large, or even infinite. We propose a new optimistic, UCB-like algorithm for non-linearly parameterized bandit problems using the Generalized Linear Model (GLM) framework. We analyze the regret of the proposed algorithm, termed GLM-UCB, obtaining results similar to those recently proved in the literature for the linear regression case. The analysis also highlights a key difficulty of the non-linear case, which is solved in GLM-UCB by focusing on the reward space rather than on the parameter space. Moreover, as the practical efficiency of current parameterized bandit algorithms is often disappointing, we provide an asymptotic argument leading to significantly faster convergence. Simulation studies on real data sets illustrate the performance and the robustness of the proposed GLM-UCB approach.
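The abstract's central idea, computing the optimistic index in the reward space of a GLM rather than in the parameter space, can be sketched in a few lines. The Python sketch below is a minimal, illustrative GLM-UCB-style loop assuming a logistic inverse link, a generic exploration schedule rho(t), and a squared-error surrogate for the maximum quasi-likelihood fit; these choices are assumptions made for illustration and are not the exact quantities or constants analyzed in the paper.

```python
import numpy as np
from scipy.optimize import minimize


def glm_ucb(arms, pull, horizon,
            mu=lambda z: 1.0 / (1.0 + np.exp(-z)),        # assumed logistic inverse link
            rho=lambda t: np.sqrt(2.0 * np.log(t + 2))):   # illustrative exploration rate
    """Minimal GLM-UCB-style loop (illustrative sketch, not the paper's exact algorithm).

    arms    : (K, d) array with one feature vector per arm.
    pull    : callable mapping an arm's feature vector to an observed reward.
    horizon : total number of rounds.
    """
    K, d = arms.shape
    M = np.eye(d)              # regularized design matrix sum_s x_s x_s^T
    xs, rs = [], []            # history of played arms and observed rewards
    theta_hat = np.zeros(d)

    def fit_loss(theta):
        # Squared-error surrogate for the GLM quasi-likelihood fit of theta.
        preds = mu(np.asarray(xs) @ theta)
        return np.sum((np.asarray(rs) - preds) ** 2)

    for t in range(horizon):
        if t < K:
            idx = t            # initialization: play each arm once
        else:
            theta_hat = minimize(fit_loss, theta_hat).x
            M_inv = np.linalg.inv(M)
            # Optimism in the reward space: predicted mean reward mu(x' theta_hat)
            # plus an exploration bonus scaling with ||x||_{M^{-1}}.
            widths = np.sqrt(np.einsum('kd,de,ke->k', arms, M_inv, arms))
            idx = int(np.argmax(mu(arms @ theta_hat) + rho(t) * widths))
        x = arms[idx]
        r = pull(x)
        xs.append(x)
        rs.append(r)
        M += np.outer(x, x)
    return theta_hat
```

In this sketch the exploration bonus is added after the non-linear link is applied, which is what distinguishes reward-space optimism from building a confidence set directly around the parameter vector; the fit and the inverse of M are also recomputed from scratch each round for clarity, whereas incremental rank-one updates would be the natural optimization in practice.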
Year
2010
Venue
NIPS
Field
Mathematical optimization, Parameterized complexity, Regret, Computer science, Generalization, Generalized linear model, Parametric statistics, Multi-armed bandit, Parameter space, Artificial intelligence, Machine learning, Finite time
DocType
Conference
Citations
61
PageRank
2.74
References
7
Authors
4
Name                Order  Citations  PageRank
Sarah Filippi       1      96         7.66
O. Cappe            2      2112       207.95
Aurélien Garivier   3      547        33.15
Csaba Szepesvári    4      1774       157.06