Image-to-image translation
Computer science
Artificial intelligence
Pattern recognition
Image
Divergence
Invariant
Metric
Representation
Mathematics
Data mining
Authors
Hsin-Ying Lee, Hung-Yu Tseng, Qi Mao, Jia-Bin Huang, Yu-Ding Lu, Maneesh Singh, Ming-Hsuan Yang
Identifier
DOI: 10.1007/s11263-019-01284-z
Abstract
Image-to-image translation aims to learn the mapping between two visual domains. There are two main challenges for this task: (1) lack of aligned training pairs and (2) multiple possible outputs from a single input image. In this work, we present an approach based on disentangled representation for generating diverse outputs without paired training images. To synthesize diverse outputs, we propose to embed images onto two spaces: a domain-invariant content space capturing shared information across domains and a domain-specific attribute space. Our model takes the encoded content features extracted from a given input and attribute vectors sampled from the attribute space to synthesize diverse outputs at test time. To handle unpaired training data, we introduce a cross-cycle consistency loss based on disentangled representations. Qualitative results show that our model can generate diverse and realistic images on a wide range of tasks without paired training data. For quantitative evaluations, we measure realism with a user study and the Fréchet inception distance, and measure diversity with a perceptual distance metric, the Jensen–Shannon divergence, and the number of statistically different bins.
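To make the two-space embedding and the cross-cycle consistency idea above concrete, the following is a minimal PyTorch-style sketch. All module names, layer sizes, and the attribute dimension (ContentEncoder, AttributeEncoder, Generator, an 8-dimensional attribute space) are assumed placeholders for illustration; this is not the authors' architecture or training code.

```python
# Minimal sketch (not the authors' code) of diverse image-to-image translation with
# disentangled representations: a shared, domain-invariant content space and a
# domain-specific attribute space. Module names and layer sizes are illustrative only.
import torch
import torch.nn as nn

class ContentEncoder(nn.Module):
    """Maps an image to a domain-invariant content feature map (shared across domains)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)

class AttributeEncoder(nn.Module):
    """Maps an image to a low-dimensional, domain-specific attribute vector."""
    def __init__(self, attr_dim=8):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, attr_dim)

    def forward(self, x):
        return self.fc(self.conv(x).flatten(1))

class Generator(nn.Module):
    """Synthesizes an image from a content feature map and an attribute vector."""
    def __init__(self, attr_dim=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(128 + attr_dim, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, content, attr):
        # Broadcast the attribute vector over the spatial grid and fuse it with the content map.
        a = attr[:, :, None, None].expand(-1, -1, content.size(2), content.size(3))
        return self.net(torch.cat([content, a], dim=1))

def cross_cycle_loss(x, y, Ec, Ea_x, Ea_y, Gx, Gy):
    """Cross-cycle consistency on an unpaired (x, y): swap attributes across domains,
    translate, then translate again and require the originals to be reconstructed."""
    cx, cy = Ec(x), Ec(y)          # shared, domain-invariant content
    ax, ay = Ea_x(x), Ea_y(y)      # domain-specific attributes
    u = Gy(cx, ay)                 # content of x rendered in domain Y
    v = Gx(cy, ax)                 # content of y rendered in domain X
    x_back = Gx(Ec(u), Ea_x(v))    # the second translation should recover x ...
    y_back = Gy(Ec(v), Ea_y(u))    # ... and y
    return (x_back - x).abs().mean() + (y_back - y).abs().mean()

if __name__ == "__main__":
    # Diverse synthesis at test time: encode the content once, then resample attribute vectors.
    Ec, Ea_x, Ea_y = ContentEncoder(), AttributeEncoder(), AttributeEncoder()
    Gx, Gy = Generator(), Generator()
    x = torch.randn(1, 3, 64, 64)   # dummy domain-X image
    y = torch.randn(1, 3, 64, 64)   # dummy, unpaired domain-Y image
    outputs = [Gy(Ec(x), torch.randn(1, 8)) for _ in range(5)]  # five diverse domain-Y outputs
    loss = cross_cycle_loss(x, y, Ec, Ea_x, Ea_y, Gx, Gy)
    print(outputs[0].shape, loss.item())
```

Because the content code is reused while the attribute vector is resampled, a single input can map to several distinct outputs, which is where the diversity described in the abstract comes from.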
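The diversity measures named at the end of the abstract can likewise be sketched. The snippet below computes the Jensen–Shannon divergence between two histograms and a mean pairwise distance over a set of generated samples; the paper's perceptual distance is a learned image metric, so the plain L2 distance used here is only an illustrative stand-in, and the array shapes are made up.

```python
# Hedged sketch of two diversity-style measures; not the paper's evaluation code.
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two discrete distributions (histograms)."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log(a / b))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def mean_pairwise_distance(samples):
    """Average pairwise distance among generated outputs; larger means more diverse."""
    flat = [np.asarray(s, dtype=float).ravel() for s in samples]
    dists = [np.linalg.norm(a - b) for i, a in enumerate(flat) for b in flat[i + 1:]]
    return float(np.mean(dists))

# Toy usage: histograms of generated vs. real pixel values, and diversity of the fakes.
rng = np.random.default_rng(0)
fake = rng.normal(0.0, 1.0, size=(10, 64, 64, 3))
real = rng.normal(0.1, 1.1, size=(10, 64, 64, 3))
hist_fake, _ = np.histogram(fake, bins=50, range=(-4, 4), density=True)
hist_real, _ = np.histogram(real, bins=50, range=(-4, 4), density=True)
print("JS divergence:", js_divergence(hist_fake, hist_real))
print("mean pairwise distance:", mean_pairwise_distance(fake))
```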