Keywords
Computer Science
Computer Vision
Artificial Intelligence
Image Fusion
Transformer
Image Registration
Image (mathematics)
Engineering
Electrical Engineering
Voltage
Authors
Xinyu Xie,Xiaozhi Zhang,Xinglong Tang,Jiaxi Zhao,Dongping Xiong,Lijun Ouyang,Bin Yang,Hong Zhou,Bingo Wing‐Kuen Ling,Kok Lay Teo
Source
Journal: IEEE Journal of Biomedical and Health Informatics
[Institute of Electrical and Electronics Engineers]
Date: 2024-01-01
Volume/Issue: 1-12
Cited by: 2
Identifier
DOI:10.1109/jbhi.2024.3391620
Abstract
Multimodal medical image fusion aims to integrate complementary information from different modalities of medical images. Deep learning methods, especially recent vision Transformers, have substantially improved image fusion performance. However, Transformers have limitations in image fusion, such as a lack of local feature extraction and of cross-modal feature interaction, which leads to insufficient multimodal feature extraction and integration. In addition, the computational cost of Transformers is high. To address these challenges, in this work we develop an adaptive cross-modal fusion strategy for unsupervised multimodal medical image fusion. Specifically, we propose a novel lightweight cross Transformer based on a cross multi-axis attention mechanism. It combines cross-window attention and cross-grid attention to mine and integrate both local and global interactions of multimodal features. The cross Transformer is further guided by a spatial adaptation fusion module, which allows the model to focus on the most relevant information. Moreover, we design a dedicated feature extraction module that combines multiple gradient residual dense convolutional and Transformer layers to obtain local features from coarse to fine and to capture global features. The proposed strategy significantly boosts fusion performance while minimizing computational cost. Extensive experiments, including clinical brain tumor image fusion, show that our model achieves clearer texture details and better visual quality than other state-of-the-art fusion methods.
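The cross multi-axis attention the abstract describes pairs a local branch (cross-window attention) with a global branch (cross-grid attention), with queries drawn from one modality and keys/values from the other. The following is a minimal NumPy sketch of that idea, not the authors' implementation: it assumes single-head attention, square feature maps, and a MaxViT-style window/grid partition; all function names and sizes here are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def window_partition(x, w):
    # (H, W, C) -> (num_windows, w*w, C): contiguous w-by-w local windows
    H, W, C = x.shape
    x = x.reshape(H // w, w, W // w, w, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, w * w, C)

def grid_partition(x, g):
    # (H, W, C) -> (num_groups, g*g, C): each group holds tokens spaced
    # H//g apart, giving a dilated, image-wide (global) receptive field
    H, W, C = x.shape
    x = x.reshape(g, H // g, g, W // g, C)
    return x.transpose(1, 3, 0, 2, 4).reshape(-1, g * g, C)

def cross_attention(q_tokens, kv_tokens):
    # cross-modal attention: queries from one modality,
    # keys/values from the other (hypothetical single-head form)
    d = q_tokens.shape[-1]
    attn = softmax(q_tokens @ kv_tokens.transpose(0, 2, 1) / np.sqrt(d))
    return attn @ kv_tokens

def cross_multi_axis_attention(feat_a, feat_b, w=4, g=4):
    # local interaction: cross-window attention between modalities
    local = cross_attention(window_partition(feat_a, w),
                            window_partition(feat_b, w))
    # global interaction: cross-grid attention between modalities
    glob = cross_attention(grid_partition(feat_a, g),
                           grid_partition(feat_b, g))
    # un-partitioning back to (H, W, C) and projections omitted for brevity
    return local, glob
```

For two 8x8x16 feature maps with `w = g = 4`, each branch yields 4 token groups of 16 tokens each; a full model would fold these back into spatial maps and fuse them, e.g. via the spatial adaptation module the paper mentions.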