Optimized single-image super-resolution reconstruction: A multimodal approach based on reversible guidance and cyclical knowledge distillation

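Computer science; Distillation; Image (mathematics); Artificial intelligence; Resolution (logic); Super-resolution; Computer vision; Machine learning; Chromatography; Chemistry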

Authors

Jingke Yan, Qin Wang, Cheng Yao, ZhaoYu Su, Fan Zhang, MeiLing Zhong, Lei Liu, Bo Jin, Weihua Zhang

Source

Journal: Engineering Applications of Artificial Intelligence [Elsevier BV]
Date: 2024-05-06 Volume/Issue: 133: 108496-108496 Citations: 2

Identifier

DOI: 10.1016/j.engappai.2024.108496

Abstract

This paper proposes a new approach for reconstructing high-resolution images from low-resolution inputs using Denoising Diffusion Probabilistic Models (DDPMs). Existing DDPMs, while promising, face two issues: detail discrepancies caused by the uncertain degradation factors in low-resolution images, and slow sampling. To address these, a multimodal approach based on reversible guidance and cyclical knowledge distillation (MRKD) is introduced. The method builds on the idea that prior and posterior probabilities help in understanding and predicting future events from available data and information. In MRKD, text and image information are encoded separately, and novel constraints are applied to the prior and posterior distributions to optimize the detailed features of the reconstructed image. Because the degradation factors in low-resolution images are uncertain, single-image super-resolution is a 'one-to-many' mapping problem. In response, the paper redefines the constraints on the posterior distribution using the log-likelihood. Specifically, a Bayesian transformation of the input and output of the observation model is used to guide the diffusion process. To speed up DDPM sampling, a cyclical knowledge distillation strategy is proposed that iteratively transfers learned parameters from a high-step DDPM to a low-step model, accelerating sampling while preserving image quality. The experimental results demonstrate that these strategies enable the model to comprehend the high-level semantics and contextual information within images. They also address challenges associated with mode collapse, the loss of high-frequency details, and the complexities of long-tail data.
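
The abstract does not give the form of the posterior constraint. As a rough illustration only, the standard Bayes decomposition that guided diffusion methods commonly rely on can be written as below. The symbols are assumptions introduced here, not notation from the paper: $x_t$ is the noisy high-resolution estimate at diffusion step $t$, and $y$ is the low-resolution observation. The paper's actual constraint may differ.

```latex
% Bayes' rule for the posterior over x_t given the observation y
% (x_t and y are illustrative symbols, not the paper's notation)
p(x_t \mid y) = \frac{p(y \mid x_t)\, p(x_t)}{p(y)}

% Taking the gradient of the log with respect to x_t removes p(y),
% leaving a prior term plus a log-likelihood guidance term
\nabla_{x_t} \log p(x_t \mid y)
  = \nabla_{x_t} \log p(x_t) + \nabla_{x_t} \log p(y \mid x_t)
```

Under this reading, the first term corresponds to the unconditional diffusion prior and the second to the log-likelihood of the observation model, which would steer sampling toward high-resolution images consistent with the low-resolution input.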
