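Computer science
Cluster analysis
Embedding
Machine learning
Artificial intelligence
Consistency (knowledge bases)
Semi-supervised learning
Labeled data
Convergence (economics)
Pattern recognition (psychology)
Data mining
Economy
Economic growth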
Authors
Islam A. Nassar,Munawar Hayat,Ehsan Abbasnejad,Hamid Rezatofighi,Gholamreza Haffari
Identifier
DOI:10.1109/cvpr52729.2023.01120
Abstract
Confidence-based pseudo-labeling is among the dominant approaches in semi-supervised learning (SSL). It relies on including high-confidence predictions made on unlabeled data as additional targets to train the model. We propose Protocon, a novel SSL method aimed at the less-explored label-scarce SSL setting, where such methods usually underperform. Protocon refines the pseudo-labels by leveraging their nearest neighbours' information. The neighbours are identified as training proceeds, using an online clustering approach that operates in an embedding space trained via a prototypical loss to encourage well-formed clusters. The online nature of Protocon allows it to utilise the label history of the entire dataset in one training cycle to refine labels in the following cycle, without the need to store image embeddings. Hence, it can seamlessly scale to larger datasets at a low cost. Finally, Protocon addresses the poor training signal in the initial phase of training (due to fewer confident predictions) by introducing an auxiliary self-supervised loss. It delivers significant gains and faster convergence over state-of-the-art methods across five datasets, including CIFARs, ImageNet and DomainNet.
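The refinement idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `refine_pseudo_labels`, the mixing weight `alpha`, the confidence `threshold`, and the use of a plain cluster mean as the "neighbours' information" are all assumptions made for illustration, and the cluster assignments are taken as given rather than produced by online clustering.

```python
from collections import defaultdict


def refine_pseudo_labels(probs, cluster_ids, alpha=0.8, threshold=0.95):
    """Blend each sample's class probabilities with its cluster's mean.

    probs:       list of per-sample class-probability lists
    cluster_ids: cluster assignment per sample (assumed to be given)
    alpha:       hypothetical weight on the sample's own prediction
    threshold:   hypothetical confidence cut-off for keeping a pseudo-label
    """
    num_classes = len(probs[0])

    # Group sample indices by cluster.
    members = defaultdict(list)
    for i, c in enumerate(cluster_ids):
        members[c].append(i)

    # Mean class distribution of each cluster (the neighbours' "vote").
    cluster_mean = {
        c: [sum(probs[i][k] for i in idxs) / len(idxs) for k in range(num_classes)]
        for c, idxs in members.items()
    }

    refined, labels, mask = [], [], []
    for p, c in zip(probs, cluster_ids):
        # Convex combination of own prediction and cluster mean.
        r = [alpha * pk + (1 - alpha) * mk for pk, mk in zip(p, cluster_mean[c])]
        refined.append(r)
        labels.append(max(range(num_classes), key=r.__getitem__))
        # Only confident refined predictions are used as training targets.
        mask.append(max(r) >= threshold)
    return refined, labels, mask


# Example call with three samples, two classes, and two clusters.
refined, labels, mask = refine_pseudo_labels(
    [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]], [0, 0, 1], alpha=0.5, threshold=0.7
)
```

Because each refined vector is a convex combination of two probability distributions, it remains a valid distribution, so the usual confidence threshold can be applied to it directly.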