Keywords
Computer science
Artificial intelligence
Overfitting
Image fusion
Machine learning
Convolutional neural network
Deep learning
Transformation (genetics)
Pattern recognition (psychology)
Encoder
Data mining
Artificial neural network
Image (mathematics)
Biochemistry
Chemistry
Gene
Operating system
Authors
Linhao Qu, Shaolei Liu, Manning Wang, Shiman Li, Siqi Yin, Zhijian Song
Identifier
DOI:10.1016/j.eswa.2023.121363
Abstract
Image fusion enhances a single image by integrating complementary information from multiple source images. Existing end-to-end fusion methods often suffer from overfitting or require intricate parameter tuning because task-specific training data are scarce. To address this, two-stage approaches train encoder–decoder networks on large natural-image datasets, yet their performance is limited by the domain gap between natural images and fusion tasks. In this work, we devise a novel encoder–decoder fusion framework and introduce a self-supervised training scheme based on destruction–reconstruction. We propose three auxiliary tasks that encourage task-specific feature learning: pixel-intensity non-linear transformation for multi-modal fusion, brightness transformation for multi-exposure fusion, and noise transformation for multi-focus fusion. By randomly selecting one of these tasks at each training step, the different fusion tasks reinforce one another and the network's generalizability improves. We further design an encoder combining a Convolutional Neural Network (CNN) with a Transformer to extract both local and global features. We rigorously evaluate our method against 11 traditional and deep-learning-based methods on four benchmark tasks: infrared–visible fusion, medical image fusion, multi-exposure fusion, and multi-focus fusion. Comprehensive assessments with nine metrics from diverse viewpoints consistently demonstrate the superior performance of our approach in all scenarios. We will make our code, datasets, and fused images publicly available.
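The destruction–reconstruction scheme summarized in the abstract can be illustrated with a minimal sketch: corrupt a clean image with one of three randomly chosen auxiliary transformations, then train the encoder–decoder to reconstruct the original. All names and parameter ranges below (`pixel_intensity_transform`, `brightness_transform`, `noise_transform`, `ssl_training_step`, the gamma/scale/sigma ranges) are hypothetical choices for illustration, not the authors' released implementation; the actual transforms and loss in the paper may differ.

```python
import random
import torch
import torch.nn.functional as F

def pixel_intensity_transform(img: torch.Tensor, gamma_range=(0.5, 2.0)) -> torch.Tensor:
    # Non-linear (gamma-like) remapping of pixel intensities; an illustrative
    # stand-in for the multi-modal auxiliary task.
    gamma = random.uniform(*gamma_range)
    return img.clamp(0, 1) ** gamma

def brightness_transform(img: torch.Tensor, scale_range=(0.3, 1.7)) -> torch.Tensor:
    # Global brightness scaling, mimicking exposure changes (multi-exposure task).
    scale = random.uniform(*scale_range)
    return (img * scale).clamp(0, 1)

def noise_transform(img: torch.Tensor, sigma_range=(0.01, 0.1)) -> torch.Tensor:
    # Additive Gaussian noise as an illustrative corruption for the multi-focus task.
    sigma = random.uniform(*sigma_range)
    return (img + sigma * torch.randn_like(img)).clamp(0, 1)

DESTRUCTIONS = [pixel_intensity_transform, brightness_transform, noise_transform]

def ssl_training_step(encoder, decoder, img: torch.Tensor) -> torch.Tensor:
    """One destruction-reconstruction step: corrupt the input with a randomly
    selected auxiliary task, then reconstruct the clean image."""
    destroy = random.choice(DESTRUCTIONS)   # random task selection per step
    corrupted = destroy(img)
    reconstructed = decoder(encoder(corrupted))
    return F.l1_loss(reconstructed, img)    # reconstruction loss (assumed L1)
```

Under this sketch, `encoder` and `decoder` stand for the paper's CNN–Transformer encoder and its decoder; the point illustrated is only the random alternation among the three corruption tasks during self-supervised training.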