Computer science
Computation
Convolution (computer science)
Transformer
Image (mathematics)
Range (aeronautics)
Computational complexity theory
Artificial intelligence
Image resolution
Algorithm
Theoretical computer science
Computer engineering
Pattern recognition (psychology)
Artificial neural network
Voltage
Quantum mechanics
Physics
Composite material
Materials science
Authors
Xindong Zhang, Hui Zeng, Shi Guo, Lei Zhang
Identifier
DOI:10.1007/978-3-031-19790-1_39
Abstract
Recently, transformer-based methods have demonstrated impressive results in various vision tasks, including image super-resolution (SR), by exploiting self-attention (SA) for feature extraction. However, the computation of SA in most existing transformer-based models is very expensive, and some of the employed operations may be redundant for the SR task. This limits the range of SA computation and consequently limits SR performance. In this work, we propose an efficient long-range attention network (ELAN) for image SR. Specifically, we first employ shift convolution (shift-conv) to effectively extract local structural information from the image while maintaining the same level of complexity as a 1×1 convolution. We then propose a group-wise multi-scale self-attention (GMSA) module, which calculates SA on non-overlapping groups of features using different window sizes to exploit long-range image dependencies. A highly efficient long-range attention block (ELAB) is then built by simply cascading two shift-conv layers with a GMSA module, which is further accelerated by a shared attention mechanism. Without bells and whistles, our ELAN follows a fairly simple design that sequentially cascades ELABs. Extensive experiments demonstrate that ELAN achieves even better results than previous transformer-based SR models, with significantly less complexity. The source code of ELAN can be found at https://github.com/xindongzhang/ELAN.
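The shift-conv idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that); it assumes the common formulation of shift convolution, in which the channels are split into groups, four groups are spatially shifted by one pixel in the four cardinal directions, and a 1×1 convolution (a per-pixel channel projection) mixes the result. The function name `shift_conv` and the five-way channel split are illustrative assumptions.

```python
import numpy as np

def shift_conv(x, w):
    """Hypothetical sketch of shift convolution.

    x: feature map of shape (C, H, W); w: 1x1-conv weights of shape (C_out, C).
    Channels are split into five groups: four are shifted one pixel
    (right, left, down, up) and the fifth is left in place, so the
    subsequent 1x1 projection can see a 3x3-like neighborhood at the
    cost of a 1x1 convolution.
    """
    c, h, width = x.shape
    g = c // 5  # channels per group (assumes c is divisible by 5)
    shifted = x.copy()
    shifted[0 * g:1 * g, :, 1:] = x[0 * g:1 * g, :, :-1]   # shift right
    shifted[1 * g:2 * g, :, :-1] = x[1 * g:2 * g, :, 1:]   # shift left
    shifted[2 * g:3 * g, 1:, :] = x[2 * g:3 * g, :-1, :]   # shift down
    shifted[3 * g:4 * g, :-1, :] = x[3 * g:4 * g, 1:, :]   # shift up
    # A 1x1 convolution is a matrix multiply over the channel dimension.
    return np.einsum('oc,chw->ohw', w, shifted)
```

Because the shift itself is a zero-FLOP memory operation, the whole layer has the complexity of the 1×1 projection while still aggregating neighboring pixels, which matches the complexity claim in the abstract.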