Keywords
Artificial intelligence
Computer science
Image fusion
Computer vision
Modality
Fusion
Image (mathematics)
Pattern recognition (psychology)
Authors
Man Zhou,Naishan Zheng,Xuanhua He,Danfeng Hong,Jocelyn Chanussot
Identifier
DOI:10.1109/tpami.2024.3475485
Abstract
Multi-modal image fusion aims to generate a fused image by integrating and distinguishing the cross-modality complementary information from multiple source images. While the cross-attention mechanism with global spatial interactions appears promising, it captures only second-order spatial interactions, neglecting higher-order interactions in both the spatial and channel dimensions. This limitation hampers the exploitation of synergies between modalities. To bridge this gap, we introduce a Synergistic High-order Interaction Paradigm (SHIP), designed to systematically investigate spatial fine-grained and global-statistics collaborations between the multi-modal images across two fundamental dimensions: 1) Spatial dimension: we construct spatial fine-grained interactions through element-wise multiplication, mathematically equivalent to global interactions, and then foster high-order formats by iteratively aggregating and evolving complementary information, enhancing both efficiency and flexibility. 2) Channel dimension: expanding on channel interactions with first-order statistics (mean), we devise high-order channel interactions to facilitate the discernment of inter-dependencies between source images based on global statistics. We further introduce an enhanced version of the SHIP model, called SHIP++, which enriches the cross-modality interaction representation through a cross-order attention evolving mechanism, cross-order information integration, and a residual information memorizing mechanism. Harnessing high-order interactions significantly enhances our model's ability to exploit multi-modal synergies, leading to superior performance over state-of-the-art alternatives, as shown through comprehensive experiments across various benchmarks in two significant multi-modal image fusion tasks: pan-sharpening, and infrared and visible image fusion. The source code is publicly available at https://github.com/manman1995/HOIF.
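The abstract's two core ideas — raising spatial interactions beyond second order via iterated element-wise multiplication, and gating channels by global statistics beyond the first-order mean — can be illustrated with a minimal NumPy sketch. This is not the authors' SHIP/SHIP++ implementation (see the linked repository for that); the function names, the aggregate-then-multiply update, and the sigmoid gating here are illustrative assumptions.

```python
import numpy as np

def high_order_spatial_interaction(feat_a, feat_b, order=3):
    """Illustrative sketch: the element-wise (Hadamard) product of two
    modality feature maps is a second-order spatial interaction; each
    further step mixes the running interaction with aggregated
    complementary features, raising the interaction order by one.
    feat_a, feat_b: arrays of shape (C, H, W)."""
    interaction = feat_a * feat_b              # 2nd-order interaction
    for _ in range(order - 2):                 # evolve to higher orders
        interaction = interaction * (feat_a + feat_b)  # aggregate, then multiply
    return interaction

def high_order_channel_interaction(feat_a, feat_b):
    """Illustrative sketch: weight the channels of feat_a using global
    statistics of feat_b, going beyond the first-order statistic (mean)
    by adding a second-order (variance) term."""
    mean_b = feat_b.mean(axis=(1, 2), keepdims=True)                   # 1st-order
    var_b = ((feat_b - mean_b) ** 2).mean(axis=(1, 2), keepdims=True)  # 2nd-order
    weights = 1.0 / (1.0 + np.exp(-(mean_b + var_b)))  # sigmoid channel gate
    return feat_a * weights                    # shape preserved: (C, H, W)
```

Both operations preserve the (C, H, W) feature shape, so such blocks could in principle be stacked inside a fusion backbone; the actual SHIP modules additionally include the cross-order attention evolving and residual memorizing mechanisms described above.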