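Computer science
Computer vision
Artificial intelligence
Transformer
Image fusion
Infrared
Image (mathematics)
Optics
Electrical engineering
Engineering
Voltage
Physics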
Authors
Jun Chen,Jianfeng Ding,Jiayi Ma
Identifier
DOI:10.1109/tmm.2024.3405714
Abstract
This study proposes an innovative network to fuse infrared and visible images, called HitFusion, which uses the cross-feature transformer module and is compatible with high-level vision tasks. Firstly, existing image fusion approaches primarily concentrate on optimizing human visual perception and image metrics. To enhance the performance of the fusion network in subsequent high-level vision tasks, a segmentation network and a corresponding loss are introduced into the fusion network training process. Specifically, we devise a three-stage training strategy to render the fusion network more suitable for high-level vision tasks, guided by the segmentation network and broadening the fusion network's training set to boost its generalization capability. Secondly, current transformer-based image fusion methods neglect the interaction between visible texture features and infrared contrast features. To tackle this, the cross-feature transformer module is proposed, allowing the fusion network to learn the cross-feature correlation and long-range dependencies between source images, thus achieving fusion results with good complementarity. Finally, a dual-branch fusion network is proposed, based on the distinct characteristics of different images, that targets the extraction of deep features from source images utilizing contrast residual and texture enhancement modules to achieve improved fusion results. Extensive experimental results reveal that our HitFusion method excels in both qualitative and quantitative assessments, while also demonstrating superior performance in addressing high-level vision tasks.
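The abstract describes the cross-feature transformer module only at a high level, so the following is a minimal sketch of the general idea of cross-attention between two modalities, not the HitFusion module itself. It assumes plain scaled dot-product attention with no learned query, key, or value projections, and every name in it (`cross_attention`, `visible`, `infrared`) is hypothetical and chosen for illustration. Each token from one source image weights the tokens of the other source by similarity and then aggregates them.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def softmax_rows(m):
    """Apply a numerically stable softmax to each row."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def cross_attention(query_tokens, context_tokens):
    """Attend from one modality's tokens to the other's.

    Assumption: generic scaled dot-product attention without learned
    projections; the paper's actual module may differ.
    """
    dim = len(query_tokens[0])
    scores = matmul(query_tokens, transpose(context_tokens))
    scaled = [[s / math.sqrt(dim) for s in row] for row in scores]
    weights = softmax_rows(scaled)
    # Each output token is a weighted mix of the context tokens.
    return matmul(weights, context_tokens), weights


# Toy feature tokens (hypothetical values): 3 visible tokens, 2 infrared tokens.
visible = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
infrared = [[2.0, 0.0], [0.0, 2.0]]

# Visible texture tokens query infrared contrast tokens, and vice versa.
vis_enriched, vis_to_ir = cross_attention(visible, infrared)
ir_enriched, ir_to_vis = cross_attention(infrared, visible)
```

In this toy setup, the first visible token aligns with the first infrared token and therefore gives it the larger attention weight, while the third visible token is equally similar to both and splits its weight evenly. A real fusion network would apply learned projections and operate on feature maps extracted by its encoder branches.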