Reinforcement Learning based Fragment-Aware Scheduling for High Utilization HPC Platforms - Citegraph

Paper Info

Title
Reinforcement Learning based Fragment-Aware Scheduling for High Utilization HPC Platforms

Abstract
Due to high capacity and complex scheduling activities, an HPC platform often creates resource fragments with low usability. This paper develops a novel fragment-aware scheduling approach that improves system utilization by dynamically fitting elastic lightweight tasks to the resource fragments. The new approach employs a threshold to determine the balancing factor between the length of tasks and the degree of granularity of the resource fragments. We employ the PPO (Proximal Policy Optimization) reinforcement learning approach to train a neural network that can compute the threshold precisely. With a threshold that adapts to the changing system states, the PPO-based scheduler is able to utilize the idle resources and maximize the execution success rate of the tasks.
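
The threshold idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the `Fragment` and `Task` types, the perfectly elastic runtime model, and the rule "admit a task only if its estimated runtime is at most `threshold` times the fragment's idle window" are all assumptions made for illustration. In the paper the threshold is computed by a PPO-trained network; here it is simply passed in as a number.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Fragment:
    cores: int      # idle cores in the fragment (hypothetical field)
    window: float   # seconds until the cores are reclaimed (hypothetical field)


@dataclass
class Task:
    name: str
    work: float     # total work in core-seconds (hypothetical unit)


def runtime_on(task: Task, frag: Fragment) -> float:
    # Assumption: tasks are perfectly elastic, so runtime scales as 1 / cores.
    return task.work / frag.cores


def fit_tasks(tasks: List[Task], fragments: List[Fragment],
              threshold: float) -> List[Tuple[str, Fragment]]:
    """Greedily place at most one task per fragment.

    A task is admitted to a fragment only if runtime / window <= threshold.
    Among admissible fragments, the one the task fills most tightly is chosen.
    """
    placements = []
    free = list(fragments)
    # Consider larger tasks first so they get first pick of the fragments.
    for task in sorted(tasks, key=lambda t: t.work, reverse=True):
        best = None
        for frag in free:
            ratio = runtime_on(task, frag) / frag.window
            if ratio <= threshold and (best is None or ratio > best[0]):
                best = (ratio, frag)
        if best is not None:
            free.remove(best[1])
            placements.append((task.name, best[1]))
    return placements
```

A lower threshold leaves more slack before the fragment is reclaimed, which favors task completion; a higher threshold packs fragments more tightly, which favors utilization. An adaptive scheduler would adjust this value as the system state changes.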

Year	DOI	Venue
2019	10.1109/TAAI48200.2019.8959932	2019 International Conference on Technologies and Applications of Artificial Intelligence (TAAI)
Keywords	DocType	ISSN
High-performance computing,malleable task,reinforcement learning,scheduling	Conference	2376-6816
ISBN	Citations	PageRank
978-1-7281-4667-6	0	0.34
References	Authors
5	3

Authors (3 rows)

Name	Order	Citations	PageRank
Lung-Pin Chen	1	0	0.34
I-Chen Wu	2	208	55.03
Yen-Ling Chang	3	0	0.34
