Keywords
Computer science
Artificial intelligence
Residual
Image fusion
Modality
Pattern recognition
Convolutional neural network
Fusion
Pixel
Feature extraction
Convolution
Kernel
Computer vision
Artificial neural network
Algorithm
Image
Authors
Wenqing Wang, Ji He, Han Liu, Wei Yuan
Source
Journal: Sensors (MDPI AG)
Date: 2024-06-21
Volume/Issue: 24(13): 4056
Cited by: 3
Abstract
The fusion of multi-modal medical images is of great significance for comprehensive diagnosis and treatment. However, the large differences between the various modalities of medical images make multi-modal medical image fusion a great challenge. This paper proposes a novel multi-scale fusion network based on multi-dimensional dynamic convolution and a residual hybrid transformer, which provides stronger feature extraction and context modeling and thus improves fusion performance. Specifically, the proposed network exploits multi-dimensional dynamic convolution, which introduces four attention mechanisms corresponding to four different dimensions of the convolutional kernel, to extract more detailed information. Meanwhile, a residual hybrid transformer is designed that activates more pixels to participate in the fusion process through channel attention, window attention, and overlapping cross-attention, thereby strengthening long-range dependencies between different modalities and enhancing the connection of global context information. A loss function combining perceptual loss and structural similarity loss is designed: the former enhances the visual realism and perceptual detail of the fused image, while the latter enables the model to learn structural textures. The whole network adopts a multi-scale architecture and is trained end-to-end in an unsupervised manner to realize multi-modal image fusion. Finally, our method is evaluated qualitatively and quantitatively on mainstream datasets. The fusion results indicate that it achieves high scores on most quantitative metrics and satisfactory performance in qualitative visual analysis.
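The "four attention mechanisms corresponding to four different dimensions of the convolutional kernel" describes a dynamic convolution in the spirit of omni-dimensional dynamic convolution (ODConv). Below is a minimal PyTorch sketch of that idea, not the authors' implementation: attentions over spatial positions, input channels, output channels, and the mixture of candidate kernels modulate the kernel per sample. All names and hyperparameters (MultiDimDynamicConv, num_kernels, reduction) are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiDimDynamicConv(nn.Module):
    """Sketch of a multi-dimensional dynamic convolution (ODConv-style):
    four attention branches modulate four kernel dimensions."""
    def __init__(self, in_ch, out_ch, k=3, num_kernels=4, reduction=4):
        super().__init__()
        self.k, self.num_kernels = k, num_kernels
        self.in_ch, self.out_ch = in_ch, out_ch
        # K candidate kernels, mixed per sample by the attention branches
        self.weight = nn.Parameter(
            torch.randn(num_kernels, out_ch, in_ch, k, k) * 0.02)
        hidden = max(in_ch // reduction, 4)
        self.gap = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(nn.Conv2d(in_ch, hidden, 1), nn.ReLU(inplace=True))
        # one attention head per kernel dimension
        self.attn_spatial = nn.Conv2d(hidden, k * k, 1)       # k x k positions
        self.attn_in = nn.Conv2d(hidden, in_ch, 1)            # input channels
        self.attn_out = nn.Conv2d(hidden, out_ch, 1)          # output channels
        self.attn_kernel = nn.Conv2d(hidden, num_kernels, 1)  # kernel mixture

    def forward(self, x):
        b, c, h, w = x.shape
        ctx = self.fc(self.gap(x))  # global context, shape (b, hidden, 1, 1)
        a_sp = torch.sigmoid(self.attn_spatial(ctx)).view(b, 1, 1, 1, self.k, self.k)
        a_in = torch.sigmoid(self.attn_in(ctx)).view(b, 1, 1, self.in_ch, 1, 1)
        a_out = torch.sigmoid(self.attn_out(ctx)).view(b, 1, self.out_ch, 1, 1, 1)
        a_k = torch.softmax(self.attn_kernel(ctx).view(b, self.num_kernels), dim=1)
        a_k = a_k.view(b, self.num_kernels, 1, 1, 1, 1)
        # modulate along all four dimensions, then mix the K candidates
        w_dyn = (a_k * a_out * a_in * a_sp * self.weight.unsqueeze(0)).sum(dim=1)
        # grouped-conv trick: fold the batch into groups for per-sample kernels
        x = x.reshape(1, b * c, h, w)
        w_dyn = w_dyn.reshape(b * self.out_ch, self.in_ch, self.k, self.k)
        y = F.conv2d(x, w_dyn, padding=self.k // 2, groups=b)
        return y.reshape(b, self.out_ch, h, w)
```

The grouped-convolution reshape at the end is a standard trick for applying a different kernel to each sample in the batch with a single conv2d call.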
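The residual hybrid transformer is said to combine channel attention, window attention, and overlapping cross-attention. The sketch below is a hedged approximation of such a hybrid block covering only the first two mechanisms: window self-attention in parallel with a squeeze-and-excitation style channel-attention branch, merged residually. The overlapping cross-attention stage and the paper's exact block layout are omitted, and all names are hypothetical.

```python
import torch
import torch.nn as nn

class HybridAttentionBlock(nn.Module):
    """Sketch of a hybrid block: window self-attention plus a parallel
    channel-attention branch, merged through a residual connection."""
    def __init__(self, dim, window=8, heads=4, reduction=16):
        super().__init__()
        self.window = window
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # squeeze-and-excitation style channel attention on conv features
        self.cab = nn.Sequential(
            nn.Conv2d(dim, dim, 3, padding=1), nn.GELU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(dim, dim // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(dim // reduction, dim, 1), nn.Sigmoid())
        self.conv = nn.Conv2d(dim, dim, 3, padding=1)

    def forward(self, x):  # x: (b, c, h, w); h and w divisible by window
        b, c, h, w = x.shape
        ws = self.window
        # channel-attention branch: reweight channels of conv features
        ca = self.conv(x) * self.cab(x)
        # window-attention branch: self-attention within ws x ws windows
        t = x.reshape(b, c, h // ws, ws, w // ws, ws)
        t = t.permute(0, 2, 4, 3, 5, 1).reshape(-1, ws * ws, c)
        t, _ = self.attn(*(self.norm(t),) * 3)
        t = t.reshape(b, h // ws, w // ws, ws, ws, c)
        t = t.permute(0, 5, 1, 3, 2, 4).reshape(b, c, h, w)
        return x + t + ca  # residual merge of both branches
```

Combining the two branches lets convolutional channel statistics "activate more pixels" than windowed self-attention alone, which is the stated motivation in the abstract.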
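The loss combines a perceptual term with a structural similarity term. Below is a hedged sketch, assuming a frozen VGG16 feature extractor for the perceptual loss and a simplified uniform-window SSIM; the abstract does not specify the feature layers, window, or weighting, so alpha, beta, and the layer cutoff are assumptions. Since training is unsupervised, the fused output is compared against both source modalities rather than a ground truth.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import vgg16, VGG16_Weights

class FusionLoss(nn.Module):
    """Sketch of a perceptual + SSIM loss for unsupervised image fusion."""
    def __init__(self, alpha=1.0, beta=1.0):
        super().__init__()
        # frozen VGG16 features up to relu3_3 for the perceptual term
        self.vgg = vgg16(weights=VGG16_Weights.DEFAULT).features[:16].eval()
        for p in self.vgg.parameters():
            p.requires_grad_(False)
        self.alpha, self.beta = alpha, beta

    @staticmethod
    def ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2, win=11):
        # simplified SSIM using a uniform window instead of a Gaussian
        pad = win // 2
        mu_x = F.avg_pool2d(x, win, 1, pad)
        mu_y = F.avg_pool2d(y, win, 1, pad)
        sx = F.avg_pool2d(x * x, win, 1, pad) - mu_x ** 2
        sy = F.avg_pool2d(y * y, win, 1, pad) - mu_y ** 2
        sxy = F.avg_pool2d(x * y, win, 1, pad) - mu_x * mu_y
        num = (2 * mu_x * mu_y + c1) * (2 * sxy + c2)
        den = (mu_x ** 2 + mu_y ** 2 + c1) * (sx + sy + c2)
        return (num / den).mean()

    def forward(self, fused, src_a, src_b):  # all (b, 1, h, w) grayscale
        # tile single-channel medical images to 3 channels for VGG
        f3, a3, b3 = (t.repeat(1, 3, 1, 1) for t in (fused, src_a, src_b))
        perc = F.l1_loss(self.vgg(f3), self.vgg(a3)) + \
               F.l1_loss(self.vgg(f3), self.vgg(b3))
        ssim_term = (1 - self.ssim(fused, src_a)) + (1 - self.ssim(fused, src_b))
        return self.alpha * perc + self.beta * ssim_term
```

As in the abstract's description, the perceptual term drives visual realism and perceptual detail in feature space, while the SSIM term pushes the fused image to preserve structural textures from both sources.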