Paper Info

Title
Reward-Biased Maximum Likelihood Estimation For Linear Stochastic Bandits

Abstract
Modifying the reward-biased maximum likelihood method originally proposed in the adaptive control literature, we propose novel learning algorithms to handle the explore-exploit trade-off in linear bandit problems as well as generalized linear bandit problems. We develop novel index policies that we prove achieve order-optimality, and show that they achieve empirical performance competitive with state-of-the-art benchmark methods in extensive experiments. The new policies achieve this with low computation time per pull for linear bandits, thereby yielding both favorable regret and computational efficiency.

Year	Venue	DocType
2021	Thirty-Fifth AAAI Conference on Artificial Intelligence, Thirty-Third Conference on Innovative Applications of Artificial Intelligence and the Eleventh Symposium on Educational Advances in Artificial Intelligence	Conference
Volume	ISSN	Citations
35	2159-5399	0
PageRank	References	Authors
0.34	0	4

Authors (4 rows)

Name	Order	Citations	PageRank
Yu-Heng Hung	1	0	0.68
Ping-Chun Hsieh	2	16	7.01
Xi Liu	3	122	20.80
P. R. Kumar	4	7177	1067.24
