Computer science
Artificial intelligence
Machine learning
Deep learning
Natural language processing
Authors
Kangshun Li, Can Chen, Wuteng Cao, Hui Wang, Shuai Han, Renjie Wang, Zaisheng Ye, Zhijie Wu, Wenxiang Wang, Leng Cai, Deyu Ding, Zixu Yuan
Identifier
DOI:10.1016/j.compbiomed.2023.106715
Abstract
Multimodal deep learning models have been applied to disease prediction tasks, but they are difficult to train because of conflicts between the sub-models and the fusion modules. To alleviate this issue, we propose a framework for decoupling feature alignment and fusion (DeAF), which separates multimodal model training into two stages. In the first stage, unsupervised representation learning is conducted, and the modality adaptation (MA) module is used to align the features from the various modalities. In the second stage, the self-attention fusion (SAF) module combines the medical image features and clinical data using supervised learning. We apply the DeAF framework to two tasks: predicting the postoperative efficacy of CRS for colorectal cancer, and predicting whether patients with mild cognitive impairment (MCI) convert to Alzheimer's disease. The DeAF framework achieves a significant improvement over previous methods. Furthermore, extensive ablation experiments are conducted to demonstrate the rationality and effectiveness of our framework. In conclusion, our framework enhances the interaction between the local medical image features and clinical data, and derives more discriminative multimodal features for disease prediction. The framework implementation is available at https://github.com/cchencan/DeAF.
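As a rough illustration of the second stage only, the sketch below shows how self-attention can let a clinical-data vector interact with local image features. It is not the authors' SAF module: the function names (`self_attention`, `fuse`), the identity query/key/value projections, the mean pooling, and the toy inputs are all assumptions made for illustration, and the features are assumed to be already aligned to a common dimension (the role the abstract assigns to the MA module).

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: identity Q/K/V projections (no learned weights), so the
    example stays dependency-free. A trained model would learn these.
    """
    d = len(tokens[0])
    outputs, weights = [], []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        w = softmax(scores)
        weights.append(w)
        # Each output token is an attention-weighted sum of all tokens.
        outputs.append([sum(wj * v[i] for wj, v in zip(w, tokens)) for i in range(d)])
    return outputs, weights


def fuse(image_tokens, clinical_token):
    """Hypothetical fusion: attend over [clinical, image...] tokens, then mean-pool."""
    tokens = [clinical_token] + image_tokens
    attended, weights = self_attention(tokens)
    d = len(clinical_token)
    fused = [sum(t[i] for t in attended) / len(attended) for i in range(d)]
    return fused, weights


# Toy inputs (made up): three local image features and one clinical vector,
# all in the same 4-dimensional space.
image_tokens = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
clinical_token = [0.5, 0.5, 0.0, 0.0]

fused, weights = fuse(image_tokens, clinical_token)
```

Here `weights[0]` holds the attention the clinical token pays to itself and to each image token. In a supervised second stage, a vector like `fused` would feed a classifier head; for the actual architecture, see the repository linked in the abstract.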