Computer science
Artificial intelligence
Computer vision
Training
Object
Image processing
Pattern recognition
Cognitive neuroscience of visual object recognition
Object detection
Image
Physics
Meteorology
Authors
Yu Zhang,Tao Zhang,Hongyuan Zhu,Zihan Chen,Siya Mi,Xi Peng,Xin Geng
Identifier
DOI:10.1109/tip.2025.3555073
Abstract
Self-supervised visual pre-training models have achieved significant success without requiring expensive annotations. Nevertheless, most of these models focus on iconic single-instance datasets (e.g., ImageNet), ignoring the insufficiently discriminative representations they yield for non-iconic multi-instance datasets (e.g., COCO). In this paper, we propose a novel Object Adaptive Dense Pre-training (OADP) method that learns visual representations directly on multi-instance datasets (e.g., PASCAL VOC and COCO) for dense prediction tasks (e.g., object detection and instance segmentation). We present a novel object-aware, learning-adaptive random view augmentation that focuses contrastive learning on enhancing the discrimination of object representations from large to small scales across different learning stages. Furthermore, representations across different scales and resolutions are integrated so that the method can learn diverse representations. In experiments, we evaluated OADP pre-trained on PASCAL VOC and COCO. Results show that our method outperforms most existing state-of-the-art methods when transferred to various downstream tasks, including image classification, object detection, instance segmentation and semantic segmentation.
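The abstract describes two ingredients: a contrastive objective over pairs of augmented views, and a view augmentation whose crop scale is adapted from large to small objects over the course of training. As an illustrative sketch only (the paper's actual OADP formulation is not given here), the standard InfoNCE contrastive loss and a hypothetical linear crop-scale schedule could look like this; the function names and the annealing endpoints `s_max`/`s_min` are assumptions for illustration:

```python
import numpy as np

def info_nce(z1, z2, temperature=0.1):
    """InfoNCE loss between embeddings of two augmented views (N, D).

    Row i of z1 and row i of z2 are views of the same image (positives);
    all other rows act as negatives.
    """
    # L2-normalize so similarity is cosine similarity
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature              # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positives sit on the diagonal; minimize their negative log-likelihood
    return -np.mean(np.diag(log_prob))

def crop_scale_schedule(step, total_steps, s_max=0.8, s_min=0.2):
    """Hypothetical schedule: anneal the random-crop area fraction linearly,
    so early training sees large crops (big objects) and late training sees
    small crops (small objects)."""
    frac = step / max(total_steps - 1, 1)
    return s_max + frac * (s_min - s_max)
```

With perfectly aligned views (identical embeddings for each pair) the loss approaches zero, which is the intended fixed point of the contrastive objective.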