Computer science
Artificial intelligence
Machine learning
Deep learning
Artificial neural network
Dual (grammatical number)
Sample (material)
Filter (signal processing)
Process (computing)
Pattern recognition (psychology)
Noise (video)
Image (mathematics)
Data mining
Computer vision
Literature
Chromatography
Operating system
Art
Chemistry
Authors
Gang Han, Wenping Guo, Haibo Zhang, Jie Jin, Xingli Gan, Xiaoming Zhao
Identifier
DOI:10.1016/j.compbiomed.2024.108489
Abstract
Deep neural networks (DNNs) enable advanced image processing but depend on large quantities of high-quality labeled data. The presence of noisy data significantly degrades DNN performance. In the medical field, where model accuracy is crucial and labels for pathological images are scarce and expensive to obtain, the need to handle noisy data is even more urgent. Deep networks exhibit a memorization effect: they tend to fit clean labels first in the early stages of training. Early stopping is therefore highly effective in managing learning with noisy labels. Previous research has often concentrated on developing robust loss functions or imposing training constraints to mitigate the impact of noisy labels; however, such approaches have frequently resulted in underfitting. Rather than attempting to prevent late-stage training from being affected by noisy labels, we propose using knowledge distillation to slow the learning process of the target network. In this paper, we introduce a data sample self-selection strategy based on early stopping to filter out most of the noisy data. Additionally, we employ a distillation training method with dual teacher networks to ensure steady learning of the student network. The experimental results show that our method outperforms current state-of-the-art methods for handling noisy labels on both synthetic and real-world noisy datasets. In particular, on the real-world pathological image dataset Chaoyang, the highest classification accuracy increased by 2.39%. Our method leverages the model's predictions over its training history to select a cleaner subset of the data and retrains the model on this subset, significantly mitigating the impact of noisy labels on model performance.
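The abstract describes two ingredients: selecting a cleaner subset of samples from the model's early-training (pre-memorization) predictions, and distilling the student network from two teacher networks to slow and stabilize its learning. Below is a minimal PyTorch sketch of how these two pieces could fit together; it is not the paper's implementation, and the selection criterion, the averaged soft targets, and all hyperparameters (the threshold, the temperature T, and the weight alpha) are illustrative assumptions.

```python
# Minimal, illustrative sketch (not the authors' released code): combines
# (a) a history-based "clean sample" selection step and (b) a dual-teacher
# distillation loss. Function names, the selection threshold, the averaged
# soft targets, and the hyperparameters T and alpha are all assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


def select_clean_subset(history_probs, labels, threshold=0.25):
    """Keep samples whose average softmax probability on their given label,
    recorded over early (warm-up) epochs, exceeds a threshold.

    history_probs: (epochs, N, C) softmax outputs logged during warm-up.
    labels:        (N,) possibly noisy labels.
    Returns a boolean mask of shape (N,).
    """
    avg_probs = history_probs.mean(dim=0)                         # (N, C)
    conf_on_label = avg_probs.gather(1, labels.view(-1, 1)).squeeze(1)
    return conf_on_label >= threshold


def dual_teacher_distillation_loss(student_logits, t1_logits, t2_logits,
                                   labels, T=2.0, alpha=0.5):
    """Cross-entropy on the selected labels plus KL distillation toward the
    averaged soft targets of two teacher networks."""
    soft_targets = (F.softmax(t1_logits / T, dim=1) +
                    F.softmax(t2_logits / T, dim=1)) / 2
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  soft_targets, reduction="batchmean") * (T * T)
    ce = F.cross_entropy(student_logits, labels)
    return alpha * ce + (1 - alpha) * kd


if __name__ == "__main__":
    torch.manual_seed(0)
    N, C, D = 128, 4, 16
    x = torch.randn(N, D)                       # toy features
    labels = torch.randint(0, C, (N,))          # possibly noisy labels
    # Pretend we logged softmax outputs for 5 warm-up epochs.
    history = torch.softmax(torch.randn(5, N, C), dim=-1)

    mask = select_clean_subset(history, labels)
    x_sel, y_sel = x[mask], labels[mask]

    student = nn.Linear(D, C)                   # stand-ins for real networks
    teacher1, teacher2 = nn.Linear(D, C), nn.Linear(D, C)
    opt = torch.optim.SGD(student.parameters(), lr=0.1)

    with torch.no_grad():                       # teachers stay frozen
        t1, t2 = teacher1(x_sel), teacher2(x_sel)
    loss = dual_teacher_distillation_loss(student(x_sel), t1, t2, y_sel)
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(f"kept {int(mask.sum())}/{N} samples, loss = {loss.item():.4f}")
```

In the setting the abstract describes, the teacher networks and the selection step would come from the early-stopped training history rather than the random toy models used here for brevity.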