A Semisupervised Approach for Industrial Anomaly Detection via Self-Adaptive Clustering
Anomaly detection
Cluster analysis
Computer science
Artificial intelligence
Anomaly (physics)
Data mining
Physics
Condensed matter physics
Authors
Xiaoxue Ma, Jacky Keung, Pinjia He, Yan Xiao, Xiao Yu, Yishu Li
Source
Journal: IEEE Transactions on Industrial Informatics [Institute of Electrical and Electronics Engineers]
Date: 2023-05-29
Volume (Issue): 20 (2): 1687-1697
Citations: 5
Identifier
DOI: 10.1109/tii.2023.3280246
Abstract
With the rapid development of the Industrial Internet of Things, log-based anomaly detection has become vital for smart industrial construction and has attracted contributions from many researchers. For detecting anomalies in log data, semisupervised approaches stand out from supervised and unsupervised ones because they require only a portion of the data to be labeled and are relatively stable. However, state-of-the-art semisupervised approaches still suffer from two main problems: manual parameter setting and unsatisfactory performance with many false positives. We propose AdaLog, an integrated semisupervised approach based on self-adaptive clustering, for industrial anomaly detection. In particular, the clustering step estimates label probabilities automatically by distinguishing 12 situations, so that the label probability of each unlabeled sample can be carefully calculated, leading to high accuracy. In addition, AdaLog employs a pretrained model to learn contextual information comprehensively and a transformer-based model to detect anomalies efficiently. To alleviate class imbalance, an undersampling method is incorporated. Results on three popular datasets show that AdaLog significantly outperforms three state-of-the-art semisupervised approaches by 17.8%–2489.8% on average in terms of F1-score, and is even superior to two supervised approaches in most cases, with average improvements of 10.9%–23.8%.
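
The abstract does not say which undersampling method AdaLog uses. As a rough illustration of the general idea only, the sketch below applies plain random undersampling: majority-class items are dropped at random until every class is as large as the smallest one. The function name `random_undersample`, its parameters, and the toy data are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def random_undersample(samples, labels, seed=0):
    """Randomly keep, for every class, as many items as the smallest class has.

    Illustrative random undersampling only; not AdaLog's actual method.
    """
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)

    # Every class is reduced to the size of the smallest class.
    target = min(len(items) for items in by_class.values())

    balanced = []
    for label, items in by_class.items():
        kept = rng.sample(items, target)  # sampling without replacement
        balanced.extend((sample, label) for sample in kept)

    # Shuffle so that the classes are not stored in contiguous blocks.
    rng.shuffle(balanced)
    out_samples = [sample for sample, _ in balanced]
    out_labels = [label for _, label in balanced]
    return out_samples, out_labels


# Toy data (assumed): 8 normal log sequences (label 0) and 2 anomalous ones (label 1).
samples = [f"seq{i}" for i in range(10)]
labels = [0] * 8 + [1] * 2
balanced_samples, balanced_labels = random_undersample(samples, labels)
print(Counter(balanced_labels))
```

Random undersampling discards majority-class data, so it trades some information about normal behavior for a balanced training set; the paper itself may make a different choice.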