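Sketch
Probabilistic logic
Computer science
Face (sociological concept)
Noise reduction
Artificial intelligence
Image denoising
Computer vision
Diffusion
Pattern recognition (psychology)
Algorithm
Social science
Thermodynamics
Physics
Sociology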
Authors
Yue Que,Xiong Li,Weiguo Wan,Xue Xia,Zhiwei Liu
Source
Journal: IEEE Transactions on Circuits and Systems for Video Technology
[Institute of Electrical and Electronics Engineers]
Date: 2024-01-01
Volume/Issue: 1-1
Identifier
DOI:10.1109/tcsvt.2024.3409184
Abstract
The field of face sketch-to-photo synthesis involves generating photographic facial images with enhanced details and a heightened sense of style realism. In recent years, the advancement of deep learning techniques has significantly contributed to the development of methods for synthesizing photographic face images from sketches. Nevertheless, challenges remain in synthesizing facial photographs with richer details and more accurate structural representation. This paper introduces a novel architecture for face sketch-to-photo synthesis, using denoising diffusion probabilistic models (DDPM). Our approach simplifies the complex transformation process into sequential forward and backward denoising steps. We incorporate a pretrained coarse generator to effectively encode sketch information, integrating it into each backward step to guide the generative process toward accurate photo space representation. Furthermore, we design a detail diffusion branch to refine the coarse photo face generated from the coarse generator. By deeply fusing multiscale detail features from this branch with a sophisticated conditional noise predictor, our model effectively captures the correlation between detail and stylistic elements both in sketches and in photographic faces. Extensive experimental evaluations on three datasets show the effectiveness of our model, emphasizing its ability to synthesize facial photographs with remarkable realism and rich detail. The synthesized facial images consistently demonstrate superior face recognition accuracy, surpassing that of state-of-the-art methods.
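The abstract gives no equations, so the following is only a minimal sketch of the standard DDPM forward and reverse steps that such a model builds on, not the authors' architecture. It assumes a linear beta schedule and treats the coarse photo as a plain list of floats. All names (`make_schedule`, `forward_noise`, `oracle_noise_predictor`, `reverse_step`, `sample`) are hypothetical. In particular, `oracle_noise_predictor` is a closed-form stand-in for the trained conditional noise predictor: it simply assumes the clean target equals the conditioning signal, which a real network would not know.

```python
import math
import random


def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule (an assumption) and cumulative products of 1 - beta."""
    betas = [beta_start + (beta_end - beta_start) * t / (num_steps - 1)
             for t in range(num_steps)]
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return betas, alpha_bars


def forward_noise(x0, t, alpha_bars, rng):
    """Forward step in closed form: x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps."""
    a = math.sqrt(alpha_bars[t])
    s = math.sqrt(1.0 - alpha_bars[t])
    return [a * v + s * rng.gauss(0.0, 1.0) for v in x0]


def oracle_noise_predictor(x_t, t, condition, alpha_bars):
    """Hypothetical stand-in for a trained conditional noise predictor.

    Assumes the clean image equals `condition` and solves the forward
    equation for the noise. A real model would learn this mapping.
    """
    a = math.sqrt(alpha_bars[t])
    s = math.sqrt(1.0 - alpha_bars[t])
    return [(x - a * c) / s for x, c in zip(x_t, condition)]


def reverse_step(x_t, t, condition, betas, alpha_bars, rng):
    """One conditioned backward (denoising) step of standard DDPM sampling."""
    eps_hat = oracle_noise_predictor(x_t, t, condition, alpha_bars)
    coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
    scale = 1.0 / math.sqrt(1.0 - betas[t])
    mean = [scale * (x - coef * e) for x, e in zip(x_t, eps_hat)]
    if t == 0:
        return mean  # no noise is added at the final step
    sigma = math.sqrt(betas[t])
    return [m + sigma * rng.gauss(0.0, 1.0) for m in mean]


def sample(condition, num_steps=50, seed=0):
    """Start from Gaussian noise and apply the conditioned reverse steps."""
    rng = random.Random(seed)
    betas, alpha_bars = make_schedule(num_steps)
    x = [rng.gauss(0.0, 1.0) for _ in condition]
    for t in reversed(range(num_steps)):
        x = reverse_step(x, t, condition, betas, alpha_bars, rng)
    return x
```

Calling `sample([0.2, -0.5, 0.9])` walks pure noise back toward the conditioning values, which illustrates how injecting a condition into every backward step steers generation. With a learned predictor in place of the oracle, the output would be a sample from the learned photo distribution rather than a copy of the condition.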