Abstract | ||
---|---|---|
We consider a generalization of stochastic bandit problems where the set of arms, X , is allowed to be a generic topological space. We constraint the mean-payoff function with a dissimilarity function over X in a way that is more general than Lipschitz. We construct an arm selection policy whose regret improves upon previous result for a large class of problems. In particular, our results imply that if X is the unit hypercube in a Euclidean space and the mean-payoff function has a finite number of global maxima around which the behavior of the function is locally H¨ older with a known exponent, then the expected regret is bounded up to a logarithmic factor by p n, i.e., the rate of the growth of the regret is independent of the dimension of the space. Moreover, we prove the minimax optimality of our algorithm for the class of mean-payoff functions we consider. |
Year | Venue | Keywords |
---|---|---|
2008 | NIPS | euclidean space,topological space |
Field | DocType | Citations |
Set function,Function space,Minimax,Artificial intelligence,Lipschitz continuity,Discrete mathematics,Combinatorics,Mathematical optimization,Topological space,Regret,Euclidean space,Machine learning,Mathematics,Bounded function | Conference | 43 |
PageRank | References | Authors |
3.51 | 5 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Sébastien Bubeck | 1 | 1472 | 92.28 |
Rémi Munos | 2 | 2240 | 157.06 |
Gilles Stoltz | 3 | 351 | 31.53 |
Csaba Szepesvári | 4 | 1774 | 157.06 |