Computer science
Scheduling (production processes)
Distributed computing
Mathematical optimization
Mathematics
Authors
Guannan Wu, Tao Zhang, Yunhui Qin, Jinliang Jiang
Abstract
Slurm (Simple Linux Utility for Resource Management) is a popular open-source cluster job scheduling system. In scenarios involving multiple queues and a large number of scientific computing tasks, Slurm faces challenges such as improperly set queue depths and uneven job loads across queues. This research aims to optimize Slurm's scheduling performance when resources are shared among multiple queues. Future CPU load increases are forecast from each queue's historical CPU utilization, and a window period is identified from the forecast. Based on the forecast results, a dynamic adjustment method for queue priorities is introduced, which raises job priorities in specific queues when a significant number of jobs are waiting so that they execute promptly. In addition, a dynamic adjustment strategy for queue depth is introduced to address inappropriate queue depth settings, where resources in a queue sit idle while queued jobs face scheduling delays. Experimental results demonstrate that this approach enhances the scheduling performance of Slurm in multi-queue scenarios, improving cluster resource utilization and better meeting the diverse demands of users.
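
The abstract does not say which forecasting model, thresholds, or update rules the authors use, so the following is only a minimal sketch of the general idea under stated assumptions. It forecasts a queue's next CPU utilization with a naive linear extrapolation, treats a forecast below an idle threshold as the "window period", and in that window raises the priority and depth of a queue that has a backlog. Every name and number here (`QueueState`, `forecast_cpu`, `adjust_queue`, the thresholds and step sizes) is hypothetical, and how these values would map onto actual Slurm settings is not described in the abstract and is left out.

```python
from dataclasses import dataclass


@dataclass
class QueueState:
    """Hypothetical snapshot of one queue; not a Slurm data structure."""
    name: str
    cpu_history: list[float]  # CPU utilization samples in [0, 1], oldest first
    pending_jobs: int         # jobs currently waiting in the queue
    priority: int             # illustrative queue priority value
    depth: int                # illustrative queue depth value


def forecast_cpu(history: list[float], window: int = 3) -> float:
    """Naive forecast (an assumption, not the paper's model):
    last sample plus the average step over the last `window` samples,
    clamped to [0, 1]."""
    recent = history[-window:]
    if not recent:
        return 0.0
    if len(recent) < 2:
        return recent[-1]
    trend = (recent[-1] - recent[0]) / (len(recent) - 1)
    return min(1.0, max(0.0, recent[-1] + trend))


def adjust_queue(
    queue: QueueState,
    idle_threshold: float = 0.6,   # assumed: forecast below this counts as a window
    pending_threshold: int = 50,   # assumed: backlog size that triggers a boost
    priority_boost: int = 100,     # assumed step sizes and cap
    depth_step: int = 50,
    max_depth: int = 1000,
) -> float:
    """Raise priority and depth when a window is forecast and jobs are backed up.
    Returns the forecast so callers can log or inspect it."""
    predicted = forecast_cpu(queue.cpu_history)
    in_window = predicted < idle_threshold
    has_backlog = queue.pending_jobs >= pending_threshold
    if in_window and has_backlog:
        queue.priority += priority_boost
        queue.depth = min(max_depth, queue.depth + depth_step)
    return predicted


if __name__ == "__main__":
    batch = QueueState("batch", [0.5, 0.375, 0.25], 80, 1000, 100)
    predicted = adjust_queue(batch)
    print(batch.name, predicted, batch.priority, batch.depth)
```

A real deployment would replace `forecast_cpu` with whatever model is fitted to the cluster's utilization history, and would also need a rule for lowering priority and depth again once the backlog clears, which this sketch omits.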