Computer science
Image-to-image translation
Artificial intelligence
Image (mathematics)
Inefficiency
Transformer (machine learning)
Denoising
Computer vision
Pattern recognition
Authors
Bin Xia, Yulun Zhang, Shiyin Wang, Yitong Wang, Xinglong Wu, Yapeng Tian, Wenming Yang, Radu Timofte, Luc Van Gool
Source
Journal: Cornell University - arXiv
Date: 2023-01-01
Citations: 2
Identifier
DOI: 10.48550/arXiv.2308.13767
Abstract
The Diffusion Model (DM) has emerged as the SOTA approach for image synthesis. However, existing DMs do not perform well on some image-to-image translation (I2I) tasks. Unlike image synthesis, some I2I tasks, such as super-resolution, require generating results consistent with ground-truth (GT) images. Traditional DMs for image synthesis require extensive iterations and large denoising models to estimate entire images, which gives them strong generative ability but also leads to artifacts and inefficiency for I2I. To tackle this challenge, we propose a simple, efficient, and powerful DM framework for I2I, called DiffI2I. Specifically, DiffI2I comprises three key components: a compact I2I prior extraction network (CPEN), a dynamic I2I transformer (DI2Iformer), and a denoising network. We train DiffI2I in two stages: pretraining and DM training. In pretraining, GT and input images are fed into CPEN$_{S1}$ to capture a compact I2I prior representation (IPR) that guides DI2Iformer. In the second stage, the DM is trained to estimate the same IPR as CPEN$_{S1}$ using only the input images. Compared to traditional DMs, the compact IPR enables DiffI2I to obtain more accurate outcomes and to employ a lighter denoising network with fewer iterations. Through extensive experiments on various I2I tasks, we demonstrate that DiffI2I achieves SOTA performance while significantly reducing computational burdens.
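To make the two-stage scheme in the abstract concrete, below is a minimal PyTorch sketch of the data flow only. All module internals (the CPEN encoder, the DI2Iformer stand-in, the IPR denoiser), the dimensions, the linear noising schedule, and every name here are illustrative assumptions, not the paper's actual architecture; only the stage structure (stage 1: CPEN$_{S1}$ sees GT + input and guides the restorer; stage 2: a diffusion-style denoiser learns to recover the same IPR from the input alone) follows the abstract.

```python
import torch
import torch.nn as nn

class CPEN(nn.Module):
    """Hypothetical compact prior extraction network: maps images to an IPR vector."""
    def __init__(self, in_ch, ipr_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, ipr_dim),
        )

    def forward(self, x):
        return self.net(x)

class DI2IformerStub(nn.Module):
    """Stand-in for the dynamic I2I transformer: restores the input, modulated by the IPR."""
    def __init__(self, ipr_dim=64):
        super().__init__()
        self.body = nn.Conv2d(3, 3, 3, padding=1)
        self.scale = nn.Linear(ipr_dim, 3)  # IPR -> per-channel modulation (assumption)

    def forward(self, x, ipr):
        s = self.scale(ipr).unsqueeze(-1).unsqueeze(-1)
        return self.body(x) * (1 + s)

class IPRDenoiser(nn.Module):
    """Hypothetical small denoiser operating in the compact IPR space."""
    def __init__(self, ipr_dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(ipr_dim * 2 + 1, 128), nn.ReLU(),
                                 nn.Linear(128, ipr_dim))

    def forward(self, noisy_ipr, cond, t):
        return self.net(torch.cat([noisy_ipr, cond, t], dim=1))

inp = torch.randn(2, 3, 64, 64)  # degraded input (dummy data)
gt = torch.randn(2, 3, 64, 64)   # ground-truth target (dummy data)

# Stage 1 (pretraining): CPEN_S1 sees GT + input; DI2Iformer restores under that prior.
cpen_s1 = CPEN(in_ch=6)          # GT and input concatenated along channels
restorer = DI2IformerStub()
opt1 = torch.optim.Adam(list(cpen_s1.parameters()) + list(restorer.parameters()), lr=1e-4)

ipr = cpen_s1(torch.cat([gt, inp], dim=1))
loss1 = nn.functional.l1_loss(restorer(inp, ipr), gt)
opt1.zero_grad(); loss1.backward(); opt1.step()

# Stage 2 (DM training): a denoiser conditioned on the input image alone learns
# to recover the stage-1 IPR, so GT is no longer needed at inference time.
cpen_s2 = CPEN(in_ch=3)          # conditions on the input image only
denoiser = IPRDenoiser()
opt2 = torch.optim.Adam(list(cpen_s2.parameters()) + list(denoiser.parameters()), lr=1e-4)

T = 4                            # few iterations suffice since the IPR is compact
with torch.no_grad():
    target_ipr = cpen_s1(torch.cat([gt, inp], dim=1))
t = torch.randint(1, T + 1, (2, 1)).float() / T
noise = torch.randn_like(target_ipr)
noisy = (1 - t) * target_ipr + t * noise  # simple linear noising schedule (assumption)
pred = denoiser(noisy, cpen_s2(inp), t)
loss2 = nn.functional.mse_loss(pred, target_ipr)
opt2.zero_grad(); loss2.backward(); opt2.step()
```

The design point this sketch tries to reflect is that diffusion runs over a small IPR vector rather than the full image, which is why a light denoiser and only a handful of iterations are plausible, in contrast to traditional DMs that denoise whole images.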