Title
Dynamic Computation Offloading With Energy Harvesting Devices: A Hybrid-Decision-Based Deep Reinforcement Learning Approach
Abstract
Mobile-edge computing (MEC) with energy harvesting (EH) is an emerging paradigm for improving the computation experience of Internet-of-Things (IoT) devices. In a multidevice multiserver MEC system, frequently varying harvested energy, together with changing computation task loads and time-varying server computation capacities, increases the system's dynamics. Each device should therefore learn to take coordinated actions, such as choosing the offloading ratio, local computation capacity, and server, to achieve a satisfactory computation quality. The MEC system with EH devices is thus highly dynamic and faces two challenges: 1) continuous–discrete hybrid action spaces and 2) coordination among devices. To address these challenges, we propose two deep reinforcement learning (DRL)-based algorithms for dynamic computation offloading: 1) hybrid-decision-based actor–critic learning (Hybrid-AC) and 2) multidevice hybrid-AC (MD-Hybrid-AC). Hybrid-AC handles the hybrid action space with an improved actor–critic architecture, in which the actor outputs continuous actions (offloading ratio and local computation capacity) for every server, and the critic evaluates these continuous actions and outputs the discrete action of server selection. MD-Hybrid-AC adopts the framework of centralized training with decentralized execution. It learns coordinated decisions by constructing a centralized critic that outputs server selections while considering the continuous action policies of all devices. Simulation results show that the proposed algorithms achieve a good balance between consumed time and energy, and a significant performance improvement over baseline offloading policies.
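The hybrid action selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the classes `ToyActor` and `ToyCritic`, the function `select_hybrid_action`, the random linear weights, and the three-element state vector are all hypothetical stand-ins for the trained networks and the real system state. The sketch only shows the data flow: the actor proposes one continuous action pair per server, the critic scores each proposal, and the highest-scoring server becomes the discrete choice.

```python
import math
import random


def sigmoid(x):
    """Squash a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


class ToyActor:
    """Hypothetical stand-in for the actor: one linear map per server."""

    def __init__(self, state_dim, num_servers, rng):
        # For each server: two weight rows (offloading ratio, local capacity).
        self.weights = [
            [[rng.uniform(-1, 1) for _ in range(state_dim)] for _ in range(2)]
            for _ in range(num_servers)
        ]

    def __call__(self, state):
        actions = []
        for rows in self.weights:
            ratio, capacity = (
                sigmoid(sum(w * s for w, s in zip(row, state))) for row in rows
            )
            actions.append((ratio, capacity))
        return actions  # one (ratio, capacity) pair per server, each in (0, 1)


class ToyCritic:
    """Hypothetical stand-in for the critic: one score per server proposal."""

    def __init__(self, state_dim, num_servers, rng):
        self.weights = [
            [rng.uniform(-1, 1) for _ in range(state_dim + 2)]
            for _ in range(num_servers)
        ]

    def __call__(self, state, actions):
        q_values = []
        for w, (ratio, capacity) in zip(self.weights, actions):
            features = list(state) + [ratio, capacity]
            q_values.append(sum(wi * fi for wi, fi in zip(w, features)))
        return q_values


def select_hybrid_action(actor, critic, state):
    """Pick the server whose continuous proposal the critic scores highest."""
    actions = actor(state)
    q_values = critic(state, actions)
    server = max(range(len(q_values)), key=q_values.__getitem__)
    ratio, capacity = actions[server]
    return server, ratio, capacity, q_values


rng = random.Random(0)
state = [0.6, 0.3, 0.8]  # illustrative values only, e.g. battery, task load, channel
actor = ToyActor(len(state), 3, rng)
critic = ToyCritic(len(state), 3, rng)
server, ratio, capacity, q_values = select_hybrid_action(actor, critic, state)
print(server, round(ratio, 3), round(capacity, 3))
```

In a trained system the linear maps would be replaced by neural networks, and the critic's scores would be learned value estimates rather than random projections.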
Year
2020
DOI
10.1109/JIOT.2020.3000527
Venue
IEEE Internet of Things Journal
Keywords
Servers, Task analysis, Internet of Things, Heuristic algorithms, Computational modeling, Performance evaluation, Optimization
DocType
Journal
Volume
7
Issue
10
ISSN
2327-4662
Citations
1
PageRank
0.35
References
0
Authors
4
Name
Order
Citations
PageRank
Jing Zhang110.35
Jun Du2215.67
Yuan Shen31151111.52
Jian Wang417521.03