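Computer science
Initialization
Artificial intelligence
Encoder
Labeled data
Coding (set theory)
Supervised learning
Economic shortage
Locality
Machine learning
Pattern recognition (psychology)
Set (abstract data type)
Philosophy
Programming language
Operating system
Artificial neural network
Government (linguistics)
Linguistics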
Authors
Suxian Xiang, Yue Hao, Chenxi Huang, Ping Li
Identifier
DOI:10.1109/icassp48485.2024.10447307
Abstract
This paper proposes a prior-driven semi-supervised ViT-GAN, called RC-ViTGAN, for recoloring images while retaining color harmonization and semantic rationality. The encoder of RC-ViTGAN is based on the vision transformer to avoid the locality of convolutional networks, which facilitates the extraction of global information from images. We also release the RC500 dataset, which is the largest publicly accessible and pioneering dataset for recolorization, providing convenience for subsequent studies. In addition, we present a novel semi-supervised training strategy, including a prior-driven self-supervised initialization method using contrastive learning. The proposed training strategy leverages massive amounts of unlabeled and pseudo-labeled data, addressing the shortage of labeled data in recolorization. Code and dataset are available at https://github.com/tsz12/RC-ViTGAN.git.