Title
An Optimal Algorithm for Stochastic Matroid Bandit Optimization.
Abstract
The selection of leaders in leader-follower multi-agent systems can be naturally formulated as a matroid optimization problem. In this paper, we investigate the online and stochastic version of such a problem, where in each iteration or round, we select a set of leaders and then observe a random realization of the corresponding reward, i.e., of the system performance. This problem is referred to as a stochastic matroid bandit, a variant of combinatorial multi-armed bandit problems where the underlying combinatorial structure is a matroid. We consider semi-bandit feedback and Bernoulli rewards, and derive a tight and problem-dependent lower bound on the regret of any consistent algorithm. We propose KL-OSM, a computationally efficient algorithm that exploits the matroid structure. We derive a finite-time upper bound of the regret of KL-OSM that improves the performance guarantees of existing algorithms. This upper bound actually matches our lower bound, i.e., KL-OSM is asymptotically optimal. Numerical experiments attest that KL-OSM outperforms state-of-the-art algorithms in practice, and the difference in some cases is significant.
Year
DOI
Venue
2016
10.5555/2936924.2937004
AAMAS
Keywords
Field
DocType
Multi-Armed Bandits,Online Learning,Combinatorial Optimization,Matroids,Regret Analysis
Upper and lower bounds,Computer science,Matroid partitioning,Artificial intelligence,Weighted matroid,Optimization problem,Asymptotically optimal algorithm,Matroid,Mathematical optimization,Oriented matroid,Algorithm,Combinatorial optimization,Machine learning
Conference
Citations 
PageRank 
References 
2
0.39
11
Authors
2
Name
Order
Citations
PageRank
Sadegh Talebi16411.41
A. Proutiére267351.18