Computer science
Artificial intelligence
Machine learning
Deep learning
Artificial neural network
Dual (grammatical number)
Sample (material)
Filter (signal processing)
Process (computing)
Pattern recognition (psychology)
Noise (video)
Image (mathematics)
Data mining
Computer vision
Literature
Chromatography
Operating system
Art
Chemistry
Authors
Gang Han, Wenping Guo, Haibo Zhang, Jie Jin, Xingli Gan, Xiaoming Zhao
Identifier
DOI:10.1016/j.compbiomed.2024.108489
Abstract
Deep neural networks (DNNs) enable advanced image processing but depend on large quantities of high-quality labeled data. The presence of noisy data significantly degrades DNN performance. In the medical field, where model accuracy is crucial and labels for pathological images are scarce and expensive to obtain, the need to handle noisy data is even more urgent. Deep networks exhibit a memorization effect: they tend to fit clean labels first in the early stages of training. Early stopping is therefore highly effective in managing learning with noisy labels. Previous research has often concentrated on developing robust loss functions or imposing training constraints to mitigate the impact of noisy labels; however, such approaches have frequently resulted in underfitting. Rather than attempting to prevent late-stage training from being affected by noisy labels, we propose using knowledge distillation to slow the learning process of the target network. In this paper, we introduce a data sample self-selection strategy based on early stopping to filter out most of the noisy data. Additionally, we employ a distillation training method with dual teacher networks to ensure steady learning of the student network. The experimental results show that our method outperforms current state-of-the-art methods for handling noisy labels on both synthetic and real-world noisy datasets. In particular, on the real-world pathological image dataset Chaoyang, the highest classification accuracy increased by 2.39%. Our method leverages the model's predictions over its training history to select a cleaner subset of the data and retrains the model on this subset, significantly mitigating the impact of noisy labels on model performance.
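The abstract describes two ingredients: selecting a cleaner subset of samples from the model's early-training (pre-memorization) predictions, and distilling the student network from two teacher networks to slow and stabilize its learning. Below is a minimal PyTorch sketch of how these two pieces could fit together; it is not the paper's implementation, and the selection criterion, the averaged soft targets, and all hyperparameters (the threshold, the temperature T, and the weight alpha) are illustrative assumptions.

```python
# Minimal, illustrative sketch (not the authors' released code): combines
# (a) a history-based "clean sample" selection step and (b) a dual-teacher
# distillation loss. Function names, the selection threshold, the averaged
# soft targets, and the hyperparameters T and alpha are all assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


def select_clean_subset(history_probs, labels, threshold=0.25):
    """Keep samples whose average softmax probability on their given label,
    recorded over early (warm-up) epochs, exceeds a threshold.

    history_probs: (epochs, N, C) softmax outputs logged during warm-up.
    labels:        (N,) possibly noisy labels.
    Returns a boolean mask of shape (N,).
    """
    avg_probs = history_probs.mean(dim=0)                         # (N, C)
    conf_on_label = avg_probs.gather(1, labels.view(-1, 1)).squeeze(1)
    return conf_on_label >= threshold


def dual_teacher_distillation_loss(student_logits, t1_logits, t2_logits,
                                   labels, T=2.0, alpha=0.5):
    """Cross-entropy on the selected labels plus KL distillation toward the
    averaged soft targets of two teacher networks."""
    soft_targets = (F.softmax(t1_logits / T, dim=1) +
                    F.softmax(t2_logits / T, dim=1)) / 2
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  soft_targets, reduction="batchmean") * (T * T)
    ce = F.cross_entropy(student_logits, labels)
    return alpha * ce + (1 - alpha) * kd


if __name__ == "__main__":
    torch.manual_seed(0)
    N, C, D = 128, 4, 16
    x = torch.randn(N, D)                       # toy features
    labels = torch.randint(0, C, (N,))          # possibly noisy labels
    # Pretend we logged softmax outputs for 5 warm-up epochs.
    history = torch.softmax(torch.randn(5, N, C), dim=-1)

    mask = select_clean_subset(history, labels)
    x_sel, y_sel = x[mask], labels[mask]

    student = nn.Linear(D, C)                   # stand-ins for real networks
    teacher1, teacher2 = nn.Linear(D, C), nn.Linear(D, C)
    opt = torch.optim.SGD(student.parameters(), lr=0.1)

    with torch.no_grad():                       # teachers stay frozen
        t1, t2 = teacher1(x_sel), teacher2(x_sel)
    loss = dual_teacher_distillation_loss(student(x_sel), t1, t2, y_sel)
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(f"kept {int(mask.sum())}/{N} samples, loss = {loss.item():.4f}")
```

In the setting the abstract describes, the teacher networks and the selection step would come from the early-stopped training history rather than the random toy models used here for brevity.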