Title
Reinforcement Learning based Fragment-Aware Scheduling for High Utilization HPC Platforms
Abstract
Because of their high capacity and complex scheduling activity, HPC platforms often produce resource fragments with low usability. This paper develops a novel fragment-aware scheduling approach that improves system utilization by dynamically fitting elastic, lightweight tasks into these resource fragments. The approach uses a threshold that sets the balance between the length of tasks and the granularity of the resource fragments. Proximal Policy Optimization (PPO), a reinforcement learning method, is used to train a neural network that computes the threshold precisely. Because the threshold adapts to changing system states, the PPO-based scheduler can use idle resources while maximizing the execution success rate of the tasks.
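The threshold idea in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the `Fragment` and `Task` types, the admission rule `task.length <= threshold * fragment.duration`, and the tightest-fit tie-break are all assumptions made for illustration. In the paper the threshold is produced by a PPO-trained network; here it is a plain number passed in by the caller.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Fragment:
    """Hypothetical idle window: some nodes free for a limited time."""
    nodes: int
    duration: float


@dataclass
class Task:
    """Hypothetical lightweight task with a node demand and a run length."""
    nodes: int
    length: float


def fits(task: Task, fragment: Fragment, threshold: float) -> bool:
    # Assumed rule: the task must fit on the idle nodes, and its length
    # must not exceed the threshold-scaled fragment duration.
    return (task.nodes <= fragment.nodes
            and task.length <= threshold * fragment.duration)


def place(task: Task, fragments: List[Fragment],
          threshold: float) -> Optional[int]:
    """Return the index of the shortest admissible fragment, or None."""
    best: Optional[int] = None
    for i, fragment in enumerate(fragments):
        if not fits(task, fragment, threshold):
            continue
        if best is None or fragment.duration < fragments[best].duration:
            best = i
    return best
```

Under these assumptions, a lower threshold makes the scheduler more conservative: tasks are steered away from short fragments where they might not finish, at the cost of leaving some idle capacity unused. An adaptive threshold would move this trade-off as the system state changes.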
Year: 2019
DOI: 10.1109/TAAI48200.2019.8959932
Venue: 2019 International Conference on Technologies and Applications of Artificial Intelligence (TAAI)
Keywords: High-performance computing, malleable task, reinforcement learning, scheduling
DocType: Conference
ISSN: 2376-6816
ISBN: 978-1-7281-4667-6
Citations: 0
PageRank: 0.34
References: 5
Authors: 3
Name             Order   Citations   PageRank
Lung-Pin Chen    1       0           0.34
I-Chen Wu        2       208         55.03
Yen-Ling Chang   3       0           0.34