Title | ||
---|---|---|
Communication-Efficient Cooperative Contextual Bandit and Its Application to Wi-Fi BSS Selection |
Abstract | ||
---|---|---|
In this study, we extended a contextual bandit algorithm, LinUCB, to facilitate cooperative learning of an optimal strategy with intermittent information sharing. We then applied the algorithm to a Wi-Fi basic service set (BSS) selection problem. The BSS selection problem, in which a mobile user selects the BSS that provides maximal throughput based on observed information, remains a topic of debate. Reinforcement learning, specifically the multi-armed bandit algorithm, enables mobile users to learn an optimal strategy for selecting a good BSS in their environments. This paper proposes a cooperative contextual bandit algorithm, called Cooperative LinUCB (CoopLinUCB), to address the BSS selection problem. Conventional cooperative bandit algorithms require users to share experiences, such as contexts and payoffs, for every action, and this information sharing increases communication costs. The proposed algorithm enables mobile users to learn strategies using a limited amount of information that is shared intermittently. The learned strategies are nevertheless guaranteed to be equivalent to the strategies that are updated using the user experience information itself. A simulation evaluation based on measured throughput and received signal strength indication fingerprints demonstrates that CoopLinUCB-based BSS selection learns a BSS selection strategy faster and reduces the cumulative regret compared to BSS selection without cooperation. |
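
The equivalence claim in the abstract can be illustrated with a small sketch. It relies on a standard property of LinUCB: each arm's statistics are plain sums over experiences (A accumulates x xᵀ and b accumulates r x), so a user can add up its experiences locally and send only the accumulated sums from time to time. The code below is a minimal illustration of that additivity, not the CoopLinUCB procedure from the paper, whose details this record does not give. The names `simulate`, `period`, `D` and `ARMS` are hypothetical, the identity regularizer that each learner adds locally is left out, and contexts and payoffs are integers so that the comparison is exact.

```python
import random

D = 3      # context dimension (hypothetical value)
ARMS = 2   # number of candidate BSSs, i.e. bandit arms (hypothetical value)

def zero_stats():
    """Zero-valued (A, b) accumulators for every arm."""
    return [([[0] * D for _ in range(D)], [0] * D) for _ in range(ARMS)]

def add_experience(stats, arm, x, r):
    """Rank-one LinUCB update for one experience: A += x x^T, b += r x."""
    A, b = stats[arm]
    for i in range(D):
        b[i] += r * x[i]
        for j in range(D):
            A[i][j] += x[i] * x[j]

def merge(dst, src):
    """Add a received batch of accumulated sums into dst."""
    for (A, b), (dA, db) in zip(dst, src):
        for i in range(D):
            b[i] += db[i]
            for j in range(D):
                A[i][j] += dA[i][j]

def simulate(period, steps=40, seed=0):
    """Compare per-action sharing with sharing once every `period` actions.

    Returns (stats from per-action sharing, stats from intermittent
    sharing, number of messages sent in the intermittent case).
    """
    rng = random.Random(seed)
    full = zero_stats()     # baseline: every experience is shared at once
    shared = zero_stats()   # rebuilt only from intermittent messages
    pending = zero_stats()  # sender's sums that have not been shared yet
    messages = 0
    for t in range(1, steps + 1):
        arm = rng.randrange(ARMS)
        x = [rng.randint(0, 3) for _ in range(D)]  # synthetic context
        r = rng.randint(0, 5)                      # synthetic payoff
        add_experience(full, arm, x, r)
        add_experience(pending, arm, x, r)
        if t % period == 0:
            merge(shared, pending)
            pending = zero_stats()
            messages += 1
    if steps % period:  # flush whatever is left after the last full period
        merge(shared, pending)
        messages += 1
    return full, shared, messages

full, shared, messages = simulate(period=10)
print(full == shared, messages)
```

Each intermittent message here carries D × D + D numbers per arm, whereas a single experience carries D + 1, so whether this reduces traffic depends on how large `period` is relative to the context dimension and the number of arms.
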
Year | DOI | Venue |
---|---|---|
2020 | 10.1109/CCNC46108.2020.9045348 | 2020 IEEE 17th Annual Consumer Communications & Networking Conference (CCNC) |
Keywords | DocType | ISSN |
optimal strategy,intermittent information sharing,BSS selection problem,mobile user,reinforcement learning,multiarmed bandit algorithm,learned strategies,user experience information,CoopLinUCB-based BSS selection,Wi-Fi BSS selection,cooperative contextual bandit algorithm,Wi-Fi basic service set selection problem,communication-efficient cooperative contextual bandit,cooperative learning,received signal strength indication fingerprint | Conference | 2331-9852 |
ISBN | Citations | PageRank |
978-1-7281-3894-7 | 0 | 0.34 |
References | Authors | |
6 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Taichi Sakakibara | 1 | 0 | 0.34 |
Takayuki Nishio | 2 | 106 | 38.21 |
Akihito Taya | 3 | 0 | 2.03 |
Masahiro Morikura | 4 | 184 | 63.42 |
Koji Yamamoto | 5 | 135 | 45.58 |
Toshihisa Nabetani | 6 | 0 | 0.34 |