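Sliding window protocol
Computer science
Cloud computing
Ensemble forecasting
Random forest
Ensemble learning
Reliability (semiconductor)
Pruning
Classifier (UML)
Data mining
Artificial intelligence
Window (computing)
Power (physics)
Agronomy
Operating system
Biology
Physics
Quantum mechanics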
Authors
Adnan Tahir, Fei Chen, Abdulwahab Ali Almazroi, Nourah Janbi
Identifier
DOI:10.1016/j.jksuci.2023.101672
Abstract
Latent sector errors (LSEs) in disk drives cause significant outages, data loss, and unreliability in large-scale cloud storage systems. Predicting LSEs can help avoid these problems and improve cloud reliability. Ensemble classifiers typically outperform individual classifiers for LSE prediction with high accuracy, but they can overfit and incur additional computational cost, complexity, and time and memory consumption. This research addresses this challenge with a twofold solution: optimizing the ensemble diversity of the resulting Random Forest (RF) classifier through accuracy sliding window-based ensemble pruning (SWEP-RF), and using this pruned ensemble to predict LSEs in cloud storage. SWEP-RF maximizes its lower margin distribution to adapt the RF prediction performance and produce a strong-performing and effective subensemble. Our approach also reduces ensemble size while maintaining high prediction accuracy. We evaluate our algorithm using datasets from Baidu Inc. and Backblaze datacenters. Experimental results demonstrate that our approach achieves over 98.6% prediction accuracy, a low false alarm rate (FAR) of 0.003%, and extended mean time to data loss (MTTDL) with lead time in advance (LTA) of up to 383.4 hours and 474.3 hours, respectively. SWEP-RF outperforms classical models and current state-of-the-art techniques in prediction accuracy, FAR, MTTDL, processing time, memory consumption, and cloud availability. Our method is a promising solution for enhancing cloud storage reliability through proactive LSE prediction.
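The abstract does not spell out the SWEP-RF procedure, so the following is only a minimal sketch of the general idea of accuracy-ranked ensemble pruning with a sliding window, not the authors' algorithm. It assumes binary labels (1 = drive will develop an LSE, 0 = healthy), that each tree's predictions on a held-out validation set are already available as a list, and that a fixed-size window slides over the trees ranked by individual accuracy. The names `prune_by_sliding_window`, `ensemble_accuracy`, `TREE_PREDS`, and `LABELS` are hypothetical, and the data is a toy example.

```python
from typing import List, Sequence, Tuple


def majority_vote(votes: Sequence[int]) -> int:
    """Return 1 if strictly more than half of the votes are 1, else 0."""
    return 1 if 2 * sum(votes) > len(votes) else 0


def ensemble_accuracy(preds: List[List[int]], labels: List[int]) -> float:
    """Accuracy of the majority vote of the given trees on the labels."""
    correct = 0
    for i, y in enumerate(labels):
        if majority_vote([p[i] for p in preds]) == y:
            correct += 1
    return correct / len(labels)


def prune_by_sliding_window(
    tree_preds: List[List[int]], labels: List[int], window: int
) -> Tuple[List[int], float]:
    """Pick the window of accuracy-ranked trees with the best vote accuracy.

    Assumption: this is an illustrative pruning rule, not the SWEP-RF
    criterion (which the abstract describes only as margin-based).
    """
    # Rank tree indices by individual validation accuracy, best first.
    ranked = sorted(
        range(len(tree_preds)),
        key=lambda t: ensemble_accuracy([tree_preds[t]], labels),
        reverse=True,
    )
    best_acc, best_subset = -1.0, []
    # Slide a fixed-size window over the ranking and score each sub-ensemble.
    for start in range(len(ranked) - window + 1):
        subset = ranked[start:start + window]
        acc = ensemble_accuracy([tree_preds[t] for t in subset], labels)
        if acc > best_acc:
            best_acc, best_subset = acc, subset
    return sorted(best_subset), best_acc


# Toy validation data (hypothetical): 6 drives, 5 trees.
LABELS = [1, 0, 1, 0, 1, 0]
TREE_PREDS = [
    [1, 0, 1, 0, 1, 1],
    [1, 0, 1, 0, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 0, 1, 0, 1],
]

if __name__ == "__main__":
    subset, acc = prune_by_sliding_window(TREE_PREDS, LABELS, window=3)
    print("kept trees:", subset, "accuracy:", acc)
    print("full ensemble accuracy:", ensemble_accuracy(TREE_PREDS, LABELS))
```

On this toy data the three-tree sub-ensemble votes more accurately than all five trees together, which illustrates the abstract's claim that pruning can shrink the ensemble without giving up accuracy.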