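Computer science
Scheduling (production processes)
Task (project management)
Distributed computing
Task analysis
Workload
Workflow
Cache
Parallel computing
Operating system
Dynamic priority scheduling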
Authors
Masahiro Tanaka,Osamu Tatebe
Source
Venue: International Conference on Cluster Computing
Date: 2014-09-01
Citations: 7
Identifier
DOI:10.1109/cluster.2014.6968774
Abstract
Workflow scheduling to maximize I/O performance is one of the key issues in data-intensive, many-task computing. In our previous work, we proposed a locality-aware workflow scheduling method using Multi-Constraint Graph Partitioning. In this work, we focus on the read performance of input files from the disk cache (the buffer cache or page cache in main memory). To maximize the disk cache hit rate for input files, LIFO-order scheduling is effective, since newly created intermediate files are likely to be read soon. However, the LIFO policy suffers from the so-called “trailing task problem.” We propose a hybrid scheduling strategy that combines LIFO and HRF (Highest Rank First). In our strategy, one of the two policies is applied depending on the number of highest-rank tasks in the queue, which avoids the problem. In addition, we propose scheduling that overlaps computation and I/O. We implement our scheduling strategy for the Pwrake workflow system and the Gfarm distributed file system and evaluate it by executing data-intensive workflows on a computer cluster. Our scheduling strategy improves the performance of the copyfile workflow by 30% due to an increase in the disk cache hit rate, and the performance of the Montage workflow by 12% due to an increase in core utilization.
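The hybrid policy described in the abstract can be illustrated with a minimal Python sketch. This is not the Pwrake implementation. The abstract only says that the choice between LIFO and HRF depends on the number of highest-rank tasks in the queue, so the concrete rule used here (switch to HRF when that number is at or below a threshold) is an assumption, as are the names Task, pick_next, and threshold and the convention that a larger rank means higher priority.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    name: str
    rank: int  # assumed convention: larger rank = higher priority under HRF


def pick_next(queue: List[Task], threshold: int) -> Optional[Task]:
    """Remove and return the next task from `queue` (oldest first, newest last).

    Assumed switching rule (not taken from the paper):
    - if the number of highest-rank tasks is <= threshold, apply HRF so that
      the few remaining high-rank tasks are not left to trail at the end;
    - otherwise apply LIFO so that recently produced files are read while
      they are still likely to be in the disk cache.
    """
    if not queue:
        return None
    top_rank = max(t.rank for t in queue)
    top_indices = [i for i, t in enumerate(queue) if t.rank == top_rank]
    if len(top_indices) <= threshold:
        # HRF: take a highest-rank task (the newest one among ties).
        return queue.pop(top_indices[-1])
    # LIFO: take the most recently queued task regardless of rank.
    return queue.pop()


if __name__ == "__main__":
    q = [Task("a", 2), Task("b", 1), Task("c", 1), Task("d", 1)]
    order = []
    while q:
        order.append(pick_next(q, threshold=1).name)
    print(order)
```

In this sketch the single rank-2 task is dispatched first under HRF, after which the three rank-1 tasks are taken newest-first under LIFO until only one remains. A real scheduler would also need per-node queues and file locality information, which are outside the scope of this illustration.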