Title
Communication-Efficient Cooperative Contextual Bandit and Its Application to Wi-Fi BSS Selection
Abstract
In this study, we extend a contextual bandit algorithm, LinUCB, to support cooperative learning of an optimal strategy with intermittent information sharing, and we apply the resulting algorithm to the Wi-Fi basic service set (BSS) selection problem. In this problem, a mobile user selects the BSS that provides maximal throughput based on the information it observes; how best to do so remains a topic of debate. Reinforcement learning, specifically the multi-armed bandit algorithm, enables mobile users to learn an optimal strategy for selecting a good BSS in their environments. We propose a cooperative contextual bandit algorithm, called Cooperative LinUCB (CoopLinUCB), to address the BSS selection problem. Conventional cooperative bandit algorithms require users to share their experiences, such as contexts and payoffs, for every action, and this sharing increases communication costs. The proposed algorithm enables mobile users to learn strategies using a limited amount of information that is shared intermittently. The learned strategies are nevertheless guaranteed to be equivalent to strategies that are updated using the user experience information itself. A simulation evaluation based on measured throughput and received signal strength indication (RSSI) fingerprints demonstrates that CoopLinUCB-based BSS selection learns a BSS selection strategy faster and achieves lower cumulative regret than BSS selection without cooperation.
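The abstract does not say what information CoopLinUCB exchanges, so the following is only a minimal sketch of one way the stated equivalence can arise. It assumes standard LinUCB with per-arm statistics A = I + Σ x xᵀ and b = Σ r x. Because these statistics are additive, a user can accumulate local increments and share them occasionally instead of sharing every (context, payoff) pair. The names `ArmStats`, `observe`, `pop_delta`, `merge`, `select_bss`, and `solve` are hypothetical and are not taken from the paper.

```python
import math


def solve(A, y):
    """Solve A z = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][k] * z[k] for k in range(i + 1, n))
        z[i] = s / M[i][i]
    return z


class ArmStats:
    """LinUCB statistics for one arm (assumed here: one arm per candidate BSS)."""

    def __init__(self, d):
        self.A = [[float(i == j) for j in range(d)] for i in range(d)]
        self.b = [0.0] * d
        # Increments accumulated locally since the last exchange.
        self.dA = [[0.0] * d for _ in range(d)]
        self.db = [0.0] * d

    def observe(self, x, r):
        """Local update from one (context x, payoff r) pair."""
        for i in range(len(x)):
            for j in range(len(x)):
                self.A[i][j] += x[i] * x[j]
                self.dA[i][j] += x[i] * x[j]
            self.b[i] += r * x[i]
            self.db[i] += r * x[i]

    def pop_delta(self):
        """Return the not-yet-shared increments and reset them."""
        d = len(self.b)
        delta = (self.dA, self.db)
        self.dA = [[0.0] * d for _ in range(d)]
        self.db = [0.0] * d
        return delta

    def merge(self, delta):
        """Add increments received from another user."""
        dA, db = delta
        for i in range(len(db)):
            for j in range(len(db)):
                self.A[i][j] += dA[i][j]
            self.b[i] += db[i]

    def ucb(self, x, alpha=1.0):
        """Upper confidence bound: theta^T x + alpha * sqrt(x^T A^{-1} x)."""
        theta = solve(self.A, self.b)
        z = solve(self.A, x)
        mean = sum(t * xi for t, xi in zip(theta, x))
        var = sum(xi * zi for xi, zi in zip(x, z))
        return mean + alpha * math.sqrt(max(var, 0.0))


def select_bss(arms, x, alpha=1.0):
    """Pick the index of the arm with the largest upper confidence bound."""
    return max(range(len(arms)), key=lambda k: arms[k].ucb(x, alpha))
```

In this sketch, two users that each call `observe` locally and then exchange their `pop_delta()` results through `merge` end up with the same `A` and `b` as a learner that saw every pair individually. Each exchange carries one d×d matrix and one d-vector per arm, regardless of how many actions were taken since the previous exchange.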
Year
2020
DOI
10.1109/CCNC46108.2020.9045348
Venue
2020 IEEE 17th Annual Consumer Communications & Networking Conference (CCNC)
Keywords
optimal strategy, intermittent information sharing, BSS selection problem, mobile user, reinforcement learning, multiarmed bandit algorithm, learned strategies, user experience information, CoopLinUCB-based BSS selection, Wi-Fi BSS selection, cooperative contextual bandit algorithm, Wi-Fi basic service set selection problem, communication-efficient cooperative contextual bandit, cooperative learning, received signal strength indication fingerprint
DocType
Conference
ISSN
2331-9852
ISBN
978-1-7281-3894-7
Citations
0
PageRank
0.34
References
6
Authors
6
Name                 Order  Citations  PageRank
Taichi Sakakibara    1      0          0.34
Takayuki Nishio      2      106        38.21
Akihito Taya         3      0          2.03
Masahiro Morikura    4      184        63.42
Koji Yamamoto        5      135        45.58
Toshihisa Nabetani   6      0          0.34