Inpainting
Computer science
Artificial intelligence
Image (mathematics)
Computer vision
Fidelity
Image processing
Pattern recognition (psychology)
Telecommunications
Authors
Jinjia Peng,Mengkai Li,Bingyan Wang,Huibing Wang
Identifier
DOI:10.1109/tcsvt.2025.3532321
Abstract
Image inpainting aims to restore a realistic image from a damaged or incomplete version. Although Transformer-based methods have achieved impressive results by modeling long-range dependencies, the inherent quadratic complexity of canonical self-attention has typically forced these approaches into uni-dimensional modeling, which limits the model’s ability to capture complex relationships across both spatial and channel dimensions. To this end, this paper proposes a novel attention paradigm termed the Dynamic Omni-Attention Mechanism (DOAM), which simultaneously models pixel interactions across both spatial and channel dimensions and implements information interaction over the omni-axis (i.e., spatial and channel) with linear computational complexity. In addition, to handle large-scale degradation, this paper proposes a Multi-band Feature Enhancement (MFE) module that enriches feature representations during downsampling, thereby unlocking the potential of subsequent attentional interactions. Moreover, motivated by recent advances in image restoration, this paper incorporates a domain-related prior representation from a CNN-based network to modulate features within the proposed attention mechanism and the feed-forward networks. Integrating the above designs into an encoder-decoder architecture, the proposed Omni Contextual Aggregation Networks (OCANet) achieve superior performance with fewer parameters and lower time cost than competitive baselines. Extensive experiments on the CelebA-HQ, Paris Street View, FFHQ and Dunhuang datasets validate the efficacy of the proposed method.
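The abstract's key claim is attention over both the spatial and channel axes at linear cost. The paper's DOAM design is not reproduced here, but the general idea can be illustrated with the well-known linear-attention trick: normalize queries and keys separately, then aggregate the key-value context first, so the quadratic token-by-token attention map is never materialized. The sketch below is a minimal NumPy illustration under that assumption (the function names `linear_attention` and `omni_attention` are illustrative, not the authors' API); applying the same operator to the transposed feature matrix gives the channel-axis branch.

```python
import numpy as np

def linear_attention(q, k, v):
    """Linear-complexity attention sketch: softmax q over its feature
    axis and k over its token axis, then precompute k^T v so cost is
    O(N * d^2) instead of O(N^2 * d) for N tokens of dimension d."""
    q = np.exp(q - q.max(axis=-1, keepdims=True))
    q /= q.sum(axis=-1, keepdims=True)           # normalize each query row
    k = np.exp(k - k.max(axis=0, keepdims=True))
    k /= k.sum(axis=0, keepdims=True)            # normalize over tokens
    context = k.T @ v                            # (d, d) global context
    return q @ context                           # (N, d), linear in N

def omni_attention(x):
    """Illustrative omni-axis interaction: one linear-attention pass
    along the spatial (token) axis, one along the channel axis via
    transposition, summed. Not the paper's DOAM, just the principle."""
    spatial = linear_attention(x, x, x)          # tokens interact
    channel = linear_attention(x.T, x.T, x.T).T  # channels interact
    return spatial + channel

x = np.random.default_rng(0).normal(size=(64, 16))  # 64 tokens, 16 channels
y = omni_attention(x)
print(y.shape)  # (64, 16)
```

Because the (d, d) context matrix is shared by all tokens, doubling the number of spatial tokens only doubles the cost, which is the property that makes omni-axis (spatial plus channel) modeling affordable at high resolutions.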