Keywords
Modality (human-computer interaction), Artificial intelligence, Computer science, Image fusion, Fusion, Computer vision, Image (mathematics), Equivariant map, Sensor fusion, Ground truth, Segmentation, Pattern recognition, Mathematics
Authors
Zixiang Zhao, Haowen Bai, Jiangshe Zhang, Yulun Zhang, Kai Zhang, Shuang Xu, Dongdong Chen, Radu Timofte, Luc Van Gool
Source
Journal: Cornell University - arXiv
Date: 2023-01-01
Identifier
DOI: 10.48550/arxiv.2305.11443
Abstract
Multi-modality image fusion is a technique used to combine information from different sensors or modalities, allowing the fused image to retain complementary features from each modality, such as functional highlights and texture details. However, effectively training such fusion models is difficult due to the lack of ground truth fusion data. To address this issue, we propose the Equivariant Multi-Modality imAge fusion (EMMA) paradigm for end-to-end self-supervised learning. Our approach is based on the prior knowledge that natural images are equivariant to specific transformations. Thus, we introduce a novel training framework that includes a fusion module and a learnable pseudo-sensing module, which allow the network training to follow the principles of physical sensing and imaging process, and meanwhile satisfy the equivariant prior for natural images. Our extensive experiments demonstrate that our method produces high-quality fusion results for both infrared-visible and medical images, while facilitating downstream multi-modal segmentation and detection tasks. The code will be released.
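The equivariant prior described in the abstract can be illustrated with a toy sketch: fusing transformed inputs should match transforming the fused output, i.e. F(T(a), T(b)) ≈ T(F(a, b)). The names below are hypothetical, and pixel-wise maximum stands in for the paper's learned fusion network; the learnable pseudo-sensing module is omitted for brevity.

```python
import numpy as np

def fuse(a, b):
    # Stand-in for the learned fusion module (hypothetical choice:
    # pixel-wise maximum, which keeps the brighter response from
    # either modality, e.g. infrared highlights or visible texture).
    return np.maximum(a, b)

def transform(x, k=1):
    # A transformation to which natural images are assumed equivariant:
    # here, rotation by k * 90 degrees.
    return np.rot90(x, k)

def equivariance_loss(a, b, k=1):
    # Simplified self-supervised signal in the spirit of EMMA:
    # penalize the gap between F(T(a), T(b)) and T(F(a, b)).
    lhs = fuse(transform(a, k), transform(b, k))
    rhs = transform(fuse(a, b), k)
    return float(np.mean(np.abs(lhs - rhs)))

rng = np.random.default_rng(0)
ir = rng.random((8, 8))   # stand-in infrared image
vis = rng.random((8, 8))  # stand-in visible image
print(equivariance_loss(ir, vis))  # 0.0: max commutes exactly with rotation
```

For this toy fusion rule the loss is exactly zero because elementwise maximum commutes with rotation; with a learned fusion network the residual is nonzero and serves as a training signal in place of missing ground-truth fused images.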