Keywords: Artificial intelligence, Computer science, Task (project management), Modality, Segmentation, Image (mathematics), Net (polyhedron), Computer vision, Mathematics, Engineering, Chemistry, Geometry, Systems engineering, Polymer chemistry
Authors
Yingda Lyu,Zhehao Liu,Yingxin Zhang,Haipeng Chen,Zongyu Xu
Identifier
DOI:10.1016/j.cviu.2024.104138
Abstract
Data from a single modality may suffer from noise, low contrast, or other imaging limitations that reduce a model's accuracy. Moreover, because such data are limited in quantity, most models trained on single-modality data overfit the training set and perform poorly on out-of-domain data. We therefore propose the Cross-Modal Reasoning and Multi-Task Learning Network (CRML-Net), which combines cross-modal reasoning with multi-task learning to exploit complementary information across modalities and tasks, improving both generalization and accuracy. CRML-Net operates in two stages. In the first stage, the network extracts a new morphological information modality from the original image and performs cross-modal fusion with the original modality, using the morphological information to improve robustness on out-of-domain datasets. In the second stage, building on the output of the first stage, a multi-task learning mechanism shares surface detail information from auxiliary tasks to improve performance on unseen data. We validated our method on a publicly available tooth cone beam computed tomography dataset, where our evaluation shows it outperforms state-of-the-art approaches.
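The second stage's multi-task mechanism can be illustrated with a minimal sketch of the standard weighted multi-task loss, where auxiliary-task losses are down-weighted so they regularize the main task rather than dominate it. The function name, weighting scheme, and values below are illustrative assumptions, not CRML-Net's actual implementation.

```python
# Hedged sketch: weighted combination of a main segmentation loss with
# auxiliary-task losses, as is common in multi-task learning. This is an
# assumed formulation, not the paper's exact loss.

def multi_task_loss(main_loss: float, aux_losses: list[float],
                    aux_weight: float = 0.4) -> float:
    """Combine a primary loss with down-weighted auxiliary-task losses.

    The auxiliary tasks (e.g. surface-detail prediction) share features
    with the main segmentation task; their summed loss is scaled by
    aux_weight so they act as a regularizer.
    """
    return main_loss + aux_weight * sum(aux_losses)


# Example: one main loss plus two auxiliary losses.
total = multi_task_loss(0.8, [0.5, 0.3], aux_weight=0.4)
print(round(total, 2))  # 0.8 + 0.4 * (0.5 + 0.3) = 1.12
```

In practice the weight balances how strongly the shared representation is shaped by the auxiliary tasks; it is typically tuned on a validation set.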