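Computer science
Failure rate
Reliability (semiconductor)
Data set
Training set
Reliability engineering
Data mining
False alarm
Data center
Constant false alarm rate
Hotspot (geology)
False positive rate
Real-time computing
Artificial intelligence
Engineering
Operating system
Physics
Geology
Power (physics)
Quantum mechanics
Geophysics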
Authors
Haitao Yang,Zongzhao Li,Huiyuan Qiang,Zhongliang Li,Yaofeng Tu,Yanxi Yang
Identifier
DOI:10.1109/dsn-s50200.2020.00017
Abstract
Disk failure prediction has become an active topic in both academia and industry, and it is of great significance for improving data center reliability. This paper studies ZTE's disk SMART (Self-Monitoring, Analysis and Reporting Technology) data set and attempts to predict whether a disk will fail within 5-7 days. In the model training stage, each disk state is labeled as either normal or failing within 5 days. The positive and negative samples are then balanced using both over-sampling and under-sampling. Finally, an LSTM (Long Short-Term Memory) network is trained on the data set to obtain the disk failure prediction model. In experiments on ZTE's historical data set, the best FDR (Fault Detection Rate) is 97.4% and the FAR (False Alarm Rate) is 0.3%. After 7 months of deployment in a ZTE data center, the best FDR is 94.5% and the FAR is 0.7%.
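
The abstract does not define FDR and FAR or say how the sample balancing is implemented. The sketch below is one plausible reading, not the authors' code. It assumes the definitions common in disk failure prediction work: FDR is the fraction of failing disks that are correctly flagged, and FAR is the fraction of normal disks that are wrongly flagged. The function names `fdr_far` and `balance`, the label encoding (1 = failing, 0 = normal), and the `target_per_class` parameter are all hypothetical, and plain random resampling stands in for whatever sampling scheme the paper actually uses.

```python
import random


def fdr_far(y_true, y_pred):
    """Return (FDR, FAR) for binary labels, assuming 1 = failing, 0 = normal."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # failing, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # failing, missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal, not flagged
    fdr = tp / (tp + fn) if (tp + fn) else 0.0
    far = fp / (fp + tn) if (fp + tn) else 0.0
    return fdr, far


def balance(samples, labels, target_per_class, seed=0):
    """Resample so that each class has exactly target_per_class samples.

    A class larger than the target is under-sampled without replacement;
    a class smaller than the target keeps all its samples and is
    over-sampled with replacement to make up the difference.
    """
    rng = random.Random(seed)
    balanced = []
    for cls in (0, 1):
        pool = [s for s, l in zip(samples, labels) if l == cls]
        if not pool:
            raise ValueError(f"no samples with label {cls}")
        if len(pool) >= target_per_class:
            chosen = rng.sample(pool, target_per_class)
        else:
            extra = rng.choices(pool, k=target_per_class - len(pool))
            chosen = pool + extra
        balanced.extend((s, cls) for s in chosen)
    rng.shuffle(balanced)
    return [s for s, _ in balanced], [l for _, l in balanced]
```

Under these assumptions, balancing would be applied to the training split only, and FDR and FAR would be computed on untouched held-out data so that the reported rates reflect the real class ratio.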