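Overfitting
Computer science
Artificial intelligence
Pattern recognition (psychology)
Machine learning
Noise (video)
Benchmark (surveying)
Generalizability theory
Generalization
Set (abstract data type)
Artificial neural network
Speech recognition
Mathematics
Statistics
Geodesy
Programming language
Geography
Mathematical analysis
Image (mathematics)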
Authors
Cheng Cheng, Xiaoyu Liu, Beitong Zhou, Ye Yuan
Identifier
DOI:10.1109/tii.2022.3229130
Abstract
Deep neural networks (DNNs) excel at industrial fault diagnosis, but their performance relies heavily on the quality of human-annotated labels. Because of annotators' perception limitations, industrial time series samples (such as vibration and voltage signals) are frequently mislabeled under several conditions, for example samples with frequency-domain feature differences and samples on class borders. Hence, an annotated industrial dataset will inevitably contain some level of noisy labels, leading to overfitting and poor generalization of DNNs. In this work, we introduce an industrial noisy label semisupervised learning (INL-SSL) fault diagnosis approach that addresses the problem of a certain number of samples in an industrial dataset being mislabeled. The proposed INL-SSL architecture simultaneously trains two DNNs, which cross-train on each other to filter noisy label errors. In particular, a fitted Gaussian mixture model divides the time series samples of each DNN flow into an unlabeled set of samples likely to be noisy and a labeled set of samples likely to be clean. Given the labeled and unlabeled data, we propose a time series MixMatch semisupervised learning strategy to train the diagnostic model. An ablation study verifies the benefit of the proposed time series augmentation techniques for semisupervised training. Extensive experiments on a benchmark industrial dataset of rolling element bearings (REB) show that INL-SSL outperforms state-of-the-art approaches. On another self-collected REB dataset, the proposed approach also outperforms the other comparison methods under noise ratios from 20% to 90%, validating the model's generalizability.
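The Gaussian-mixture split described in the abstract can be illustrated with a minimal sketch. The abstract does not say which quantity the mixture model is fitted on, so this example assumes it is the per-sample training loss, a common choice in noisy-label learning; the function names, the 0.5 threshold, and the toy loss values are all hypothetical and are not taken from the paper.

```python
import math


def fit_two_component_gmm(values, iters=100):
    """Fit a 1-D, two-component Gaussian mixture with plain EM."""
    lo, hi = min(values), max(values)
    means = [lo, hi]  # start the two components at the extremes
    spread = max((hi - lo) ** 2 / 4.0, 1e-6)
    variances = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each value
        resp = []
        for v in values:
            dens = [_weighted_density(v, means[k], variances[k], weights[k])
                    for k in range(2)]
            total = sum(dens) or 1e-300  # guard against underflow
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and variance per component
        for k in range(2):
            nk = sum(r[k] for r in resp) or 1e-12
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var = sum(r[k] * (v - means[k]) ** 2
                      for r, v in zip(resp, values)) / nk
            variances[k] = max(var, 1e-6)  # floor keeps the density finite
            weights[k] = nk / len(values)
    return means, variances, weights


def _weighted_density(v, mean, var, weight):
    """Mixture weight times the Gaussian density at v."""
    return (weight * math.exp(-(v - mean) ** 2 / (2 * var))
            / math.sqrt(2 * math.pi * var))


def split_by_loss(losses, threshold=0.5):
    """Return (labeled_idx, unlabeled_idx) from per-sample losses.

    Assumption: the low-mean component models clean samples, so a
    sample is kept as labeled when its posterior probability of
    belonging to that component exceeds `threshold`.
    """
    means, variances, weights = fit_two_component_gmm(losses)
    clean = 0 if means[0] <= means[1] else 1
    labeled, unlabeled = [], []
    for i, v in enumerate(losses):
        dens = [_weighted_density(v, means[k], variances[k], weights[k])
                for k in range(2)]
        total = sum(dens)
        p_clean = dens[clean] / total if total > 0 else 0.0
        (labeled if p_clean > threshold else unlabeled).append(i)
    return labeled, unlabeled


# Toy losses: 80 small values (likely clean) and 20 large ones (likely noisy).
toy_losses = [0.05 + 0.001 * i for i in range(80)] + \
             [1.5 + 0.05 * i for i in range(20)]
labeled_idx, unlabeled_idx = split_by_loss(toy_losses)
```

In a two-network setup like the one the abstract describes, each network's split would presumably be handed to the other network for the semisupervised step, with the labels of the unlabeled set discarded; that wiring is left out here because the abstract gives no further detail.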