Keywords
Computer science, Frequency domain, Artificial intelligence, Fast Fourier transform, Spatial frequency, Convolutional neural network, Convolution (computer science), Pattern recognition (psychology), Transformer, Fourier transform, Computer vision, Algorithm, Artificial neural network, Mathematics, Voltage, Mathematical analysis, Physics, Quantum mechanics, Optics
Authors
Mingwen Shao,Yuanjian Qiao,Deyu Meng,Wangmeng Zuo
Identifier
DOI:10.1016/j.knosys.2023.110306
Abstract
Existing convolutional neural network (CNN)-based and vision Transformer (ViT)-based image restoration methods are usually explored in the spatial domain. However, we employ Fourier analysis to show that these spatial-domain models cannot perceive the entire frequency spectrum of images, i.e., they mainly focus on either high-frequency (CNN-based models) or low-frequency components (ViT-based models). This intrinsic limitation results in the partial loss of semantic information and the appearance of artifacts. To address this limitation, we propose a novel uncertainty-guided hierarchical frequency-domain Transformer, named HFDT, to effectively learn both high- and low-frequency information while perceiving local and global features. Specifically, to aggregate semantic information from various frequency levels, we propose a dual-domain feature interaction mechanism, in which global frequency information and local spatial features are extracted by corresponding branches. The frequency-domain branch adopts the Fast Fourier Transform (FFT) to convert features from the spatial domain to the frequency domain, where the global low- and high-frequency components are learned with log-linear complexity. Complementarily, an efficient convolution group is employed in the spatial-domain branch to capture local high-frequency details. Moreover, we introduce an uncertainty degradation-guided strategy to efficiently represent degradation prior information, rather than simply distinguishing degraded/non-degraded regions in binary form. Our approach achieves competitive results in several degraded scenarios, including rain streaks, raindrops, motion blur, and defocus blur.
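To make the dual-domain idea in the abstract concrete, the following is a minimal PyTorch sketch, not the authors' HFDT code: the class and layer names (DualDomainBlock, freq_mix, spatial, fuse) are our own illustrative assumptions. The frequency branch mixes features globally after a 2-D FFT, while the spatial branch uses a small convolution group for local high-frequency detail; the two outputs are then fused.

```python
# Hypothetical sketch of a dual-domain feature interaction block
# (illustrative only; layer names and sizes are assumptions, not the paper's).
import torch
import torch.nn as nn


class DualDomainBlock(nn.Module):
    """Toy dual-branch block: a frequency branch that mixes features
    globally via the FFT, plus a spatial branch of local convolutions."""

    def __init__(self, channels: int):
        super().__init__()
        # Frequency branch: a 1x1 conv over stacked real/imaginary parts
        # of the 2-D FFT touches every spatial frequency at once,
        # giving a global receptive field at FFT cost.
        self.freq_mix = nn.Conv2d(2 * channels, 2 * channels, kernel_size=1)
        # Spatial branch: depthwise + pointwise convs for local detail.
        self.spatial = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, groups=channels),
            nn.GELU(),
            nn.Conv2d(channels, channels, kernel_size=1),
        )
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # --- frequency branch: spatial -> frequency -> mix -> spatial ---
        f = torch.fft.rfft2(x, norm="ortho")            # complex, (b, c, h, w//2+1)
        f = torch.cat([f.real, f.imag], dim=1)          # stack real/imag as channels
        f = self.freq_mix(f)
        real, imag = f.chunk(2, dim=1)
        x_freq = torch.fft.irfft2(torch.complex(real, imag), s=(h, w), norm="ortho")
        # --- spatial branch: local high-frequency features ---
        x_spat = self.spatial(x)
        # fuse global (frequency) and local (spatial) features, residual add
        return x + self.fuse(torch.cat([x_freq, x_spat], dim=1))


if __name__ == "__main__":
    block = DualDomainBlock(channels=32)
    y = block(torch.randn(1, 32, 64, 64))
    print(y.shape)  # torch.Size([1, 32, 64, 64])
```

Because the 1x1 convolution acts on the full FFT output, the frequency branch has a global receptive field at the O(HW log HW) cost of the transform, which is one way to read the "log-linear complexity" claim in the abstract; the actual HFDT architecture may differ.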