Keywords: Computer Science; Transformer; Locality; Super-resolution; Image; Pixel; Artificial Intelligence; Algorithm; Computer Engineering; Electrical Engineering; Engineering
Authors
Ling Zheng, Jinchen Zhu, Jinpeng Shi, Shizhuang Weng
Identifier
DOI: 10.1016/j.engappai.2024.108035
Abstract
Recently, transformer-based methods have achieved impressive results in single image super-resolution (SISR). However, the lack of a locality mechanism and their high complexity limit their application. To solve these problems, we propose a new method, the Efficient Mixed Transformer (EMT), in this study. Specifically, we propose the Mixed Transformer Block (MTB), consisting of multiple consecutive transformer layers, in some of which the Pixel Mixer (PM) replaces Self-Attention (SA). PM enhances local knowledge aggregation through pixel shifting operations, and it introduces no additional complexity, as it has no parameters and no floating-point operations. Moreover, we develop a striped window SA to achieve efficient global dependency modeling by exploiting image anisotropy. Experimental results show that EMT outperforms existing methods on benchmark datasets and achieves state-of-the-art performance.
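To make the two mechanisms described in the abstract concrete, below is a minimal PyTorch sketch of a Pixel Mixer-style operation and a striped-window partition. It assumes PM works by cyclically shifting four channel groups one pixel each in the four cardinal directions (via `torch.roll`), and that striped-window SA flattens anisotropic stripes of the feature map into token sequences before windowed attention. The function names, grouping scheme, and shift/stripe sizes here are illustrative assumptions, not the paper's exact implementation.

```python
import torch


def pixel_mixer(x: torch.Tensor, shift: int = 1) -> torch.Tensor:
    """Sketch of a parameter-free Pixel Mixer: split channels into four
    groups and cyclically shift each group one pixel in a different
    cardinal direction, so every position aggregates information from
    its neighbors with no parameters and no floating-point operations.

    x: feature map of shape (B, C, H, W).
    """
    _, c, _, _ = x.shape
    g = c // 4  # channels per direction (grouping scheme is an assumption)
    out = x.clone()
    out[:, 0 * g:1 * g] = torch.roll(x[:, 0 * g:1 * g], shifts=shift, dims=3)   # right
    out[:, 1 * g:2 * g] = torch.roll(x[:, 1 * g:2 * g], shifts=-shift, dims=3)  # left
    out[:, 2 * g:3 * g] = torch.roll(x[:, 2 * g:3 * g], shifts=shift, dims=2)   # down
    out[:, 3 * g:4 * g] = torch.roll(x[:, 3 * g:4 * g], shifts=-shift, dims=2)  # up
    return out  # any remaining C - 4*g channels stay unshifted


def stripe_partition(x: torch.Tensor, sh: int, sw: int) -> torch.Tensor:
    """Partition (B, C, H, W) into non-overlapping sh x sw stripes and
    flatten each stripe into a token sequence for windowed attention.
    Choosing sh != sw (e.g. 4 x 16) yields the anisotropic striped
    windows the abstract refers to; H and W must be divisible by sh, sw.
    """
    b, c, h, w = x.shape
    x = x.view(b, c, h // sh, sh, w // sw, sw)
    # -> (B * num_stripes, sh * sw, C), one token sequence per stripe
    return x.permute(0, 2, 4, 3, 5, 1).reshape(-1, sh * sw, c)


if __name__ == "__main__":
    feat = torch.randn(2, 32, 16, 32)
    print(pixel_mixer(feat).shape)              # torch.Size([2, 32, 16, 32])
    print(stripe_partition(feat, 4, 16).shape)  # torch.Size([16, 64, 32])
```

Since `torch.roll` only moves memory, the mixing step adds no learnable weights and no FLOPs, which matches the abstract's claim that PM can replace SA in some layers without increasing complexity.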