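Computer science
Artificial intelligence
Background (archaeology)
Segmentation
Residual
Feature (linguistics)
Computer vision
Noise (video)
Pattern recognition (psychology)
Sparse approximation
Image resolution
Spatial frequency
Spatial contextual awareness
Encoder
Image (mathematics)
Algorithm
Optics
Physics
Paleontology
Linguistics
Philosophy
Biology
Operating system
Authors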
Mingjin Zhang,Rui Zhang,Jing Zhang,Jie Guo,Yunsong Li,Xinbo Gao
Identifier
DOI:10.1109/tgrs.2023.3263848
Abstract
Infrared small target detection (IRSTD) is important for many practical applications, such as hazardous aircraft warning, especially when the target is not visible in visible-light images because of atmospheric conditions such as fog and cloud. However, IRSTD is challenging because the targets are small and dim and the images are noisy. To address this challenge, we propose a novel Dim2Clear Network (Dim2Clear) for IRSTD. Specifically, Dim2Clear consists of a U-Net backbone encoder, a context mixer decoder (CMD) based on spatial and frequency attention (SFA), and an eyeball-shaped enhancement module (EEM). The CMD is composed of cascaded regular residual blocks into which two SFA modules are inserted. Each SFA module receives features from different residual blocks and generates a spatial attention map from them to modulate the low-level features, which are then decomposed into low- and high-frequency components using the discrete cosine transform. The features are then further modulated by the generated frequency attention maps. In this way, SFA extracts both spatial context and frequency context to improve the feature representation capacity. In addition, we design an EEM that suppresses noise and enhances the signal-to-noise ratio of the segmentation results from the perspective of image super-resolution. Experiments on the SIRST dataset and our newly constructed IRSTD-1k dataset show that the proposed Dim2Clear outperforms state-of-the-art methods.
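The SFA steps described in the abstract (spatial modulation, a discrete cosine transform split into low and high frequencies, then frequency modulation) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `dct_matrix`, `split_frequencies`, and `sfa_sketch`, the `cutoff` parameter, the channel-mean and sigmoid gating, and the per-channel band weights are all assumptions made for illustration, and the learned layers of the actual network are not reproduced. The sketch also assumes that both feature maps share the same spatial size.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Return the orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]  # frequency index
    i = np.arange(n)[None, :]  # sample index
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)  # rescale the DC row so that c @ c.T == I
    return c


def split_frequencies(feat: np.ndarray, cutoff: int):
    """Split a (C, H, W) feature map into low- and high-frequency parts.

    Coefficients with u + v < cutoff count as "low"; this threshold rule
    is an illustrative assumption, not taken from the paper.
    """
    _, h, w = feat.shape
    ch, cw = dct_matrix(h), dct_matrix(w)
    coeffs = ch @ feat @ cw.T  # 2D DCT applied to every channel
    u = np.arange(h)[:, None]
    v = np.arange(w)[None, :]
    low_mask = (u + v) < cutoff
    low = ch.T @ (coeffs * low_mask) @ cw  # inverse DCT of the low band
    high = ch.T @ (coeffs * ~low_mask) @ cw  # inverse DCT of the high band
    return low, high


def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))


def sfa_sketch(low_level: np.ndarray, high_level: np.ndarray, cutoff: int = 4) -> np.ndarray:
    """Hypothetical stand-in for one SFA module, with no learned weights."""
    # Spatial attention: one (1, H, W) map derived from the other block's features.
    spatial_attn = sigmoid(high_level.mean(axis=0, keepdims=True))
    modulated = low_level * spatial_attn
    # Frequency decomposition via the DCT.
    low, high = split_frequencies(modulated, cutoff)
    # Frequency attention: one scalar gate per channel and band (assumed form).
    w_low = sigmoid(np.abs(low).mean(axis=(1, 2), keepdims=True))
    w_high = sigmoid(np.abs(high).mean(axis=(1, 2), keepdims=True))
    return w_low * low + w_high * high


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    low_level = rng.standard_normal((3, 8, 8))
    high_level = rng.standard_normal((5, 8, 8))
    print(sfa_sketch(low_level, high_level).shape)
```

Because the DCT is linear and the two masks are complementary, the low and high parts returned by `split_frequencies` sum back to the input, so the split itself loses no information; only the attention weights change the balance between the two bands.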