Computer science
Artificial intelligence
Feature
Convolutional neural network
Feature learning
Fusion
Transformer
Computer vision
Pattern recognition
Image
Authors
Weisheng Li, Yin Zhang, Guofen Wang, Yuping Huang, Ruyue Li
Identifier
DOI:10.1016/j.bspc.2022.104402
Highlights
• A model combining a CNN module and a transformer module is proposed for multimodal medical image fusion. The CNN module extracts detail and texture information, while the transformer module extracts pixel-intensity distribution information. Extensive experiments on the Harvard brain atlas test dataset demonstrate that the proposed method outperforms comparable algorithms.
• A GSFAM is proposed and applied to the encoder, where it fully aggregates the different features learned by the multi-scale transformer module. Subjective and objective experiments show that adding this module significantly improves the quality of the reconstructed images.
• A fusion strategy combining the maximum of local energy information with image gradient information is proposed and applied to the MRI-PET and MRI-SPECT multimodal medical image fusion tasks. It preserves the texture details of the source images without losing the more important structural information carried by their pixel-distribution differences.
• Extensive experiments show that the proposed algorithm outperforms classical medical image fusion algorithms in both objective and subjective evaluation. These results contribute to the further development of medical image fusion.
Abstract
In recent years, several medical image fusion techniques based on the convolutional neural network (CNN) have been proposed for various medical image fusion tasks. However, these methods cannot model the long-range dependencies between the fused image and the source images. To address this limitation, we propose DFENet, a multimodal medical image fusion framework that integrates CNN feature learning and vision transformer feature learning through self-supervised learning. DFENet is built on an encoder-decoder network that can be trained on a large-scale natural image dataset without carefully collated ground-truth fusion images. The network consists of an encoder, a feature fuser, and a decoder. The encoder comprises a CNN module and a transformer module, which extract local and global image features, respectively. To avoid simple up-sampling and concatenation, a new global semantic information aggregation module is proposed to efficiently aggregate the multi-scale features produced by the transformer module, which enhances the quality of the reconstructed images. The decoder consists of six convolution layers with two skip connections and reconstructs the image from the fused features. For fusing the features of magnetic resonance imaging and functional medical images, we also propose a fusion strategy that combines local energy and gradient information. Compared with conventional fusion rules, this strategy is more robust to noisy images; compared with existing competitive methods, our method retains more texture details of the source images and produces a more natural and realistic fused image.
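To make the architecture above concrete, here is a minimal PyTorch sketch of the dual-branch encoder and the six-layer decoder with two skip connections. Everything here is an assumption for illustration: the layer widths, the 8×8 patch embedding, the additive form of the skip connections, and the plain concatenation of the two branches (a simple stand-in for the proposed GSFAM) are not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class DualBranchEncoder(nn.Module):
    """Illustrative sketch of the encoder: a CNN branch for local
    detail/texture features and a transformer branch for global
    pixel-intensity distribution. Sizes are assumptions."""
    def __init__(self, channels=64, heads=4, depth=2):
        super().__init__()
        # CNN branch: stacked 3x3 convolutions extract local texture detail.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
        )
        # Transformer branch: patch embedding + self-attention capture
        # long-range dependencies across the whole image.
        self.patch_embed = nn.Conv2d(1, channels, kernel_size=8, stride=8)
        layer = nn.TransformerEncoderLayer(d_model=channels, nhead=heads,
                                           batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=depth)
        self.up = nn.Upsample(scale_factor=8, mode='bilinear',
                              align_corners=False)

    def forward(self, x):
        # x: (B, 1, H, W), with H and W divisible by the patch size 8.
        local_feat = self.cnn(x)                    # (B, C, H, W)
        tokens = self.patch_embed(x)                # (B, C, H/8, W/8)
        b, c, h, w = tokens.shape
        tokens = tokens.flatten(2).transpose(1, 2)  # (B, h*w, C)
        global_feat = self.transformer(tokens)
        global_feat = global_feat.transpose(1, 2).reshape(b, c, h, w)
        global_feat = self.up(global_feat)          # back to (B, C, H, W)
        # Simple concatenation stands in for the GSFAM aggregation step.
        return torch.cat([local_feat, global_feat], dim=1)

class Decoder(nn.Module):
    """Six conv layers; the two skip connections are sketched as
    additive shortcuts (the exact wiring is not given in the abstract)."""
    def __init__(self, channels=128):
        super().__init__()
        self.c1 = nn.Conv2d(channels, 64, 3, padding=1)
        self.c2 = nn.Conv2d(64, 64, 3, padding=1)
        self.c3 = nn.Conv2d(64, 64, 3, padding=1)
        self.c4 = nn.Conv2d(64, 32, 3, padding=1)
        self.c5 = nn.Conv2d(32, 32, 3, padding=1)
        self.c6 = nn.Conv2d(32, 1, 3, padding=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, f):
        x1 = self.act(self.c1(f))
        x2 = self.act(self.c2(x1))
        x3 = self.act(self.c3(x2)) + x1   # first skip connection
        x4 = self.act(self.c4(x3))
        x5 = self.act(self.c5(x4)) + x4   # second skip connection
        return torch.sigmoid(self.c6(x5))
```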
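Because training is self-supervised, a training step only asks the encoder-decoder to reconstruct its input image, so no ground-truth fused images are required. A minimal sketch, assuming a plain L1 reconstruction loss (the abstract does not specify the actual loss function):

```python
import torch.nn.functional as F

def train_step(encoder, decoder, batch, optimizer):
    """One self-supervised reconstruction step on natural images.
    `batch` is a (B, 1, H, W) tensor; no fused ground truth is involved."""
    optimizer.zero_grad()
    features = encoder(batch)       # fusion is skipped during training
    recon = decoder(features)
    loss = F.l1_loss(recon, batch)  # stand-in reconstruction objective
    loss.backward()
    optimizer.step()
    return loss.item()
```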
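The fusion strategy can be read as a per-position choice between the two sources' features, driven by local energy and gradient information. The sketch below is one plausible interpretation, not the paper's exact rule: local energy as a windowed mean of squared activations, gradient information as an absolute Laplacian response, and a hard selection of the more salient source; the window size, the kernel, and the equal weighting of the two terms are all assumptions.

```python
import torch
import torch.nn.functional as F

def energy_gradient_fuse(feat_a, feat_b, win=3):
    """Pick, per position, the feature whose local-energy + gradient
    saliency is larger. feat_a/feat_b: (B, C, H, W) encoder outputs."""
    c = feat_a.shape[1]
    # Averaging kernel for local energy over a win x win neighborhood.
    box = torch.ones(c, 1, win, win, device=feat_a.device) / (win * win)
    # Laplacian kernel as a simple stand-in for "gradient information".
    lap = torch.tensor([[0., 1., 0.],
                        [1., -4., 1.],
                        [0., 1., 0.]], device=feat_a.device)
    lap = lap.view(1, 1, 3, 3).repeat(c, 1, 1, 1)

    def saliency(f):
        energy = F.conv2d(f * f, box, padding=win // 2, groups=c)
        grad = F.conv2d(f, lap, padding=1, groups=c).abs()
        return energy + grad

    mask = (saliency(feat_a) >= saliency(feat_b)).float()
    return mask * feat_a + (1 - mask) * feat_b
```

At inference, each source image would be encoded separately, the two feature maps fused with a rule of this kind, and the result passed to the decoder to produce the fused image.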