计算机科学
人工智能
噪音(视频)
分类器(UML)
机器学习
模式识别(心理学)
域适应
特征(语言学)
噪声数据
人工神经网络
数据挖掘
哲学
语言学
图像(数学)
作者
Peng-Fei Zhang,Zi Huang,Guangdong Bai,Xin-Shun Xu
标识
DOI:10.1145/3503161.3548053
摘要
Data annotations obtained for supervised learning often suffer from label noise, which would inevitably incur unreliable deep neural networks. Existing solutions to this problem typically limit the scope to instance-independent label noise. Due to the high illegibility of data and the inexperience of annotators, instance-dependent noise has also been widely observed, however, not being investigated. In this paper, we propose a novel \underlineIDE ntify and \underlineAL ign (IDEAL) methodology, which aims to eliminate the feature distribution shift raised by a broad spectrum of noise patterns. The proposed model is capable of learning noise-resilient feature representations, thereby correctly predicting data instances. More specifically, we formulate the robust learning against noisy labels as a domain adaptation problem by identifying noisy data (i.e., data samples with incorrect labels) and clean data from the dataset as two domains and minimizing their domain discrepancy in the feature space. In this framework, a high-order-ensemble adaptation network is devised to provide high-confidence predictions, according to which a specific criterion is defined for differentiating clean and noisy data. A new metric based on data augmentation is designed to measure the discrepancy between the clean and noisy domains. Along with a min-max learning strategy between the feature encoder and the classifier on the discrepancy, the domain gap will be bridged, which encourages a noise-resilient model. In-depth theoretical analysis and extensive experiments on widely-used benchmark datasets demonstrate the effectiveness of the proposed method.
科研通智能强力驱动
Strongly Powered by AbleSci AI