棱锥(几何)
特征(语言学)
人工智能
融合
一般化
计算机科学
骨干网
模式识别(心理学)
情态动词
图像融合
图像(数学)
数学
化学
哲学
语言学
几何学
数学分析
计算机网络
高分子化学
作者
Huiqin Wu,Lihong Peng,Dongyang Du,Hui Xu,Guoyu Lin,Zidong Zhou,Lijun Lu,Wenbing Lv
标识
DOI:10.1088/1361-6560/ad3cb2
摘要
Abstract Objective . To go beyond the deficiencies of the three conventional multimodal fusion strategies (i.e. input-, feature- and output-level fusion), we propose a bidirectional attention-aware fluid pyramid feature integrated fusion network (BAF-Net) with cross-modal interactions for multimodal medical image diagnosis and prognosis. Approach . BAF-Net is composed of two identical branches to preserve the unimodal features and one bidirectional attention-aware distillation stream to progressively assimilate cross-modal complements and to learn supplementary features in both bottom-up and top-down processes. Fluid pyramid connections were adopted to integrate the hierarchical features at different levels of the network, and channel-wise attention modules were exploited to mitigate cross-modal cross-level incompatibility. Furthermore, depth-wise separable convolution was introduced to fuse the cross-modal cross-level features to alleviate the increase in parameters to a great extent. The generalization abilities of BAF-Net were evaluated in terms of two clinical tasks: (1) an in-house PET-CT dataset with 174 patients for differentiation between lung cancer and pulmonary tuberculosis. (2) A public multicenter PET-CT head and neck cancer dataset with 800 patients from nine centers for overall survival prediction. Main results . On the LC-PTB dataset, improved performance was found in BAF-Net (AUC = 0.7342) compared with input-level fusion model (AUC = 0.6825; p < 0.05), feature-level fusion model (AUC = 0.6968; p = 0.0547), output-level fusion model (AUC = 0.7011; p < 0.05). On the H&N cancer dataset, BAF-Net (C-index = 0.7241) outperformed the input-, feature-, and output-level fusion model, with 2.95%, 3.77%, and 1.52% increments of C-index ( p = 0.3336, 0.0479 and 0.2911, respectively). The ablation experiments demonstrated the effectiveness of all the designed modules regarding all the evaluated metrics in both datasets. Significance . Extensive experiments on two datasets demonstrated better performance and robustness of BAF-Net than three conventional fusion strategies and PET or CT unimodal network in terms of diagnosis and prognosis.
科研通智能强力驱动
Strongly Powered by AbleSci AI