Abstract |
---|
We study a variant of the stochastic multi-armed bandit problem where the set of available arms varies arbitrarily with time (also known as the sleeping bandit problem). We focus on the Thompson Sampling algorithm and consider a regret notion defined with respect to the best available arm. Our main result is an O(log T) regret bound for Thompson Sampling, which generalizes a similar bound known for this algorithm from the classical bandit setting. Our bound also matches (up to constants) the best-known lower bound for the sleeping bandit problem. We show via simulations that Thompson Sampling outperforms the UCB-style AUER algorithm for the sleeping bandit problem. |
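The abstract's setting can be illustrated with a minimal sketch of Thompson Sampling restricted to the awake arms in each round. This is an illustrative implementation with Bernoulli rewards and Beta posteriors, not the paper's exact algorithm or analysis; the function names `rewards_fn` and `available_fn` are hypothetical placeholders for the environment.

```python
import random

def thompson_sampling_sleeping(rewards_fn, available_fn, n_arms, horizon, seed=0):
    """Thompson Sampling with Beta(1, 1) priors, restricted each round
    to the arms reported as available (awake) by `available_fn`.
    Illustrative sketch only; assumes Bernoulli rewards."""
    rng = random.Random(seed)
    successes = [1] * n_arms  # Beta prior pseudo-counts
    failures = [1] * n_arms
    total_reward = 0
    for t in range(horizon):
        awake = available_fn(t)  # subset of arms available at round t
        # Sample a mean from each awake arm's posterior; play the argmax.
        samples = {a: rng.betavariate(successes[a], failures[a]) for a in awake}
        arm = max(samples, key=samples.get)
        r = rewards_fn(arm, t)   # 0/1 reward for the chosen arm
        successes[arm] += r
        failures[arm] += 1 - r
        total_reward += r
    return total_reward
```

Because only awake arms are sampled, the regret is measured against the best arm that was actually available each round, matching the regret notion used in the abstract.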
Year | Venue | Field |
---|---|---|
2017 | Conference on Uncertainty in Artificial Intelligence (UAI 2017) | Computer science, Thompson sampling, Artificial intelligence, Machine learning
DocType | Citations | PageRank
---|---|---
Conference | 1 | 0.36

References | Authors
---|---
8 | 5
Name | Order | Citations | PageRank |
---|---|---|---|
Aritra Chatterjee | 1 | 1 | 0.36 |
Ganesh Ghalme | 2 | 1 | 1.71 |
Shweta Jain | 3 | 15 | 2.36 |
Rohit Vaish | 4 | 34 | 3.23
Y. Narahari | 5 | 699 | 98.97 |