Abstract | ||
---|---|---|
Cascading bandit (CB) is a popular model for web search and online advertising. However, the stationary CB model may be too simple to cope with real-world problems, where user preferences may change over time. Considering piecewise-stationary environments, two efficient algorithms, GLRT-CascadeUCB and GLRT-CascadeKL-UCB, are developed. Comparing with existing works, the proposed algorithms: i) are free of change-point-dependent information for choosing parameters; ii) have fewer tuning parameters; iii) improve regret upper bounds. We also show that the proposed algorithms are optimal up to logarithm terms by deriving a minimax lower bound Omega(root NLT) for piecewise-stationary CB. The efficiency of the proposed algorithms is validated through numerical tests on a real-world benchmark dataset. |
Year | DOI | Venue |
---|---|---|
2021 | 10.1109/ICASSP39728.2021.9414506 | 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021) |
Keywords | DocType | Citations |
Online Learning, Cascading Bandits, Non-stationary Bandits, Change-point Detection | Conference | 0 |
PageRank | References | Authors |
0.34 | 0 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Lingda Wang | 1 | 0 | 3.04 |
Huozhi Zhou | 2 | 0 | 2.03 |
Bingcong Li | 3 | 2 | 3.75 |
Lav R. Varshney | 4 | 4 | 2.83 |
Zhizhen Zhao | 5 | 42 | 6.36 |