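Computer science
Distributed computing
Scheduling (production processes)
Job shop scheduling
Stream processing
Dynamic priority scheduling
Real-time computing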
Authors
Wenxin Li,Duowen Liu,Kai Chen,Keqiu Li,Heng Qi
Source
Journal: IEEE Transactions on Parallel and Distributed Systems
[Institute of Electrical and Electronics Engineers]
Date: 2021-08-01
Volume (Issue): 32 (8): 2021-2034
Identifier
DOI:10.1109/tpds.2021.3051059
Abstract
Low-latency stream processing on large clusters consisting of hundreds to thousands of servers is an increasingly important challenge. A crucial barrier to tackling this challenge is stragglers, i.e., tasks that lag significantly behind others in processing the stream data. However, prior straggler mitigation solutions have significant limitations. They balance streaming workloads among tasks but may incur imbalanced backlogs when the workloads exhibit variance, which also causes stragglers. Fortunately, we observe that carefully scheduling the outgoing tuples of different tasks can help balance backlogs and thus avoid stragglers. To this end, we present Hone, a tuple scheduler that aims to minimize the maximum queue backlog of all tasks over time. Hone leverages an online Largest-Backlog-First (LBF) algorithm with a provably good competitive ratio to perform efficient tuple scheduling. We have implemented Hone based on Apache Storm and evaluated it extensively via both simulations and testbed experiments. Our results show that under the same workload balancing strategy, shuffle grouping, Hone outperforms the original Storm significantly, with the end-to-end tuple processing latency reduced by 78.7 percent on average.
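The Largest-Backlog-First idea named in the abstract can be illustrated with a small simulation. This is only a sketch under assumptions that are not taken from the paper: time is divided into discrete slots, each slot first adds new arrivals to every queue and then serves up to `capacity` tuples, and each served tuple is taken from whichever queue currently holds the most. The function name `lbf_schedule` and its parameters are hypothetical, and Hone's actual scheduler inside Apache Storm may differ in every detail.

```python
def lbf_schedule(backlogs, arrivals, capacity=1):
    """Toy Largest-Backlog-First simulation (illustrative, not Hone's code).

    backlogs: initial queue lengths, one per queue (at least one queue).
    arrivals: list of slots; each slot lists the tuples arriving per queue.
    capacity: assumed number of tuples that can be served in one slot.
    Returns (peak, final), where peak is the largest backlog seen at the
    start or at the end of any slot, and final is the ending queue lengths.
    """
    queues = list(backlogs)
    peak = max(queues)
    for slot in arrivals:
        # New tuples join their queues before any service happens.
        for i, arrived in enumerate(slot):
            queues[i] += arrived
        # Serve one tuple at a time, always from the longest queue.
        for _ in range(capacity):
            longest = max(range(len(queues)), key=queues.__getitem__)
            if queues[longest] == 0:
                break  # every queue is empty, nothing left to serve
            queues[longest] -= 1
        peak = max(peak, max(queues))
    return peak, queues


if __name__ == "__main__":
    # Three queues with uneven initial backlogs and no new arrivals.
    print(lbf_schedule([3, 1, 0], [[0, 0, 0]] * 3))
```

Serving the longest queue first is a greedy way to pursue the stated objective of keeping the maximum backlog small; the competitive-ratio guarantee mentioned in the abstract applies to the paper's algorithm, not to this toy model.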