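Computer science
Cloud computing
Workload
Task (project management)
Resource (disambiguation)
Oversampling
Reliability (semiconductor)
Classifier (UML)
Data mining
Resampling
Task analysis
Artificial intelligence
Machine learning
Bandwidth (computing)
Engineering
Quantum mechanics
Power (physics)
Physics
Systems engineering
Operating system
Computer network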
Authors
Jyoti Shetty, Rahul Sajjan, G Shobha
Identifier
DOI:10.1109/confluence.2019.8776612
Abstract
To improve the reliability of a cloud computing system, it is important to understand its failure characteristics and to predict failures early enough to avoid them. A statistical analysis of cloud workload data provides insight into these failure characteristics, which can serve as a cue for improving system reliability. This manuscript presents a statistical analysis of the resource usage data of tasks in the large Google cluster dataset and then develops failure prediction algorithms. The study observes that a failed task differs from a finished task in its resource usage pattern, its execution duration, and the amount of resources it uses. Different resampling techniques, combined with an XGBoost classifier, are used to predict task failure in the highly imbalanced dataset; synthetic minority oversampling together with XGBoost predicted the task status with a precision of 92% and a recall of 94.8%.
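The abstract names synthetic minority oversampling and reports precision and recall, but does not describe the implementation. The following is only a minimal sketch of the general idea, not the authors' pipeline: it interpolates new minority-class points between existing ones and computes precision and recall from labels. The function names, the two-feature toy data, and the parameter `k` are all assumptions made for illustration, and the XGBoost classifier is omitted entirely.

```python
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Simplified SMOTE-style oversampling (illustrative, not the paper's code).

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical failed-task feature vectors: [cpu_usage, memory_usage].
failed_tasks = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
extra_failed = smote_sketch(failed_tasks, n_new=10)
```

In practice the synthetic samples would be added to the training split only, so that the reported precision and recall are measured on untouched data.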