Topics
Artificial intelligence, Computer science, Infrared, Fusion, Computer vision, Net (polyhedron), Object (grammar), Range (aeronautics), Optics, Mathematics, Physics, Materials science, Philosophy, Linguistics, Geometry, Composite material
Authors
Haolong Fu, Shi-Xun Wang, Puhong Duan, Changyan Xiao, Renwei Dian, Shutao Li, Zhiyong Li
Source
Journal: IEEE Transactions on Neural Networks and Learning Systems
[Institute of Electrical and Electronics Engineers]
Date: 2023-06-07
Volume/issue: 35 (10): 13232-13245
Citations: 16
Identifier
DOI: 10.1109/TNNLS.2023.3266452
Abstract
Visible–infrared object detection aims to improve detector performance by fusing the complementary information in visible and infrared images. However, most existing methods use only local intramodality information to enhance the feature representation, ignoring the latent long-range interactions between different modalities, which leads to unsatisfactory detection performance in complex scenes. To solve these problems, we propose a feature-enhanced long-range attention fusion network (LRAF-Net), which improves detection performance by fusing the long-range dependence of the enhanced visible and infrared features. First, a two-stream CSPDarknet53 network is used to extract deep features from the visible and infrared images, in which a novel data augmentation (DA) method is designed to reduce the bias toward a single modality through asymmetric complementary masks. Then, we propose a cross-feature enhancement (CFE) module to improve the intramodality feature representation by exploiting the discrepancy between visible and infrared images. Next, we propose a long-range dependence fusion (LDF) module to fuse the enhanced features by associating the positional encoding of the multimodality features. Finally, the fused features are fed into a detection head to obtain the final detection results. Experiments on several public datasets, i.e., VEDAI, FLIR, and LLVIP, show that the proposed method achieves state-of-the-art performance compared with other methods.
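The abstract describes the LDF module only at a high level. A minimal PyTorch sketch of the general idea follows, under the assumption that long-range dependence is modeled by joint multi-head self-attention over position-encoded tokens from both modalities; the class name LongRangeFusion, the learned positional encoding, and all layer sizes are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class LongRangeFusion(nn.Module):
    """Hypothetical sketch of cross-modal long-range attention fusion.

    Flattens the visible and infrared feature maps into token sequences,
    adds a learned positional encoding, and applies joint multi-head
    self-attention so every position in one modality can attend to every
    position in the other. This is NOT the paper's exact LDF module.
    """

    def __init__(self, channels: int, num_tokens: int, num_heads: int = 8):
        super().__init__()
        # One positional slot per token from each modality (2 * H * W total).
        self.pos = nn.Parameter(torch.zeros(1, 2 * num_tokens, channels))
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, vis: torch.Tensor, ir: torch.Tensor) -> torch.Tensor:
        # vis, ir: (B, C, H, W) feature maps from the two-stream backbone;
        # H * W must equal the num_tokens used at construction time.
        b, c, h, w = vis.shape
        tokens = torch.cat(
            [vis.flatten(2).transpose(1, 2),   # (B, H*W, C) visible tokens
             ir.flatten(2).transpose(1, 2)],   # (B, H*W, C) infrared tokens
            dim=1,
        )
        tokens = tokens + self.pos
        # Joint self-attention over the concatenated multimodal sequence.
        fused, _ = self.attn(tokens, tokens, tokens)
        fused = self.norm(fused + tokens)      # residual connection
        # Fold the two halves back into a single fused feature map.
        vis_f, ir_f = fused.split(h * w, dim=1)
        return (vis_f + ir_f).transpose(1, 2).reshape(b, c, h, w)

if __name__ == "__main__":
    fuse = LongRangeFusion(channels=256, num_tokens=20 * 20)
    vis_feat = torch.randn(2, 256, 20, 20)
    ir_feat = torch.randn(2, 256, 20, 20)
    out = fuse(vis_feat, ir_feat)  # (2, 256, 20, 20), ready for a detection head
    print(out.shape)
```

Attending over the concatenated visible and infrared token sequence is what gives every spatial location in one modality access to every location in the other, which is the long-range cross-modal interaction the abstract contrasts with purely local intramodality fusion.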