Keywords
Robustness, Computer Science, Fusion, Artificial Intelligence, Sensor Fusion, Machine Learning, Multimodal, Mathematics
Authors
Qingyang Zhang, Haitao Wu, Changqing Zhang, Qinghua Hu, Huazhu Fu, Joey Tianyi Zhou, Xi Peng
Source
Journal: Cornell University - arXiv
Date: 2023-01-01
Identifier
DOI: 10.48550/arxiv.2306.02050
Abstract
The inherent challenge of multimodal fusion is to precisely capture cross-modal correlations and to conduct cross-modal interaction flexibly. To fully release the value of each modality and mitigate the influence of low-quality multimodal data, dynamic multimodal fusion has emerged as a promising learning paradigm. Despite its widespread use, theoretical justification for this paradigm is still notably lacking. Can we design a provably robust multimodal fusion method? This paper answers that question from a generalization perspective, under one of the most popular multimodal fusion frameworks. We reveal that several uncertainty estimation solutions are naturally available for achieving robust multimodal fusion, and then propose a novel framework, Quality-aware Multimodal Fusion (QMF), which improves both classification accuracy and model robustness. Extensive experiments on multiple benchmarks support our findings.
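The abstract only summarizes the approach, so the sketch below is a minimal illustration of the general idea behind dynamic, quality-aware late fusion: each modality's logits are weighted per sample by an estimated confidence, so that low-quality modalities contribute less. This is not the authors' QMF implementation; the energy-style confidence score, the softmax weighting, and the names `energy_confidence` and `quality_aware_fusion` are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def energy_confidence(logits: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    """Per-sample confidence from the negative free energy of the logits.
    Higher value ~ more confident prediction. Shape: (B, C) -> (B,)."""
    return temperature * torch.logsumexp(logits / temperature, dim=-1)

def quality_aware_fusion(logits_per_modality: list[torch.Tensor]) -> torch.Tensor:
    """Fuse unimodal logits with per-sample weights derived from confidence,
    so uncertain (low-quality) modalities are down-weighted dynamically."""
    conf = torch.stack([energy_confidence(l) for l in logits_per_modality], dim=-1)  # (B, M)
    weights = F.softmax(conf, dim=-1)                                                # (B, M)
    stacked = torch.stack(logits_per_modality, dim=-1)                               # (B, C, M)
    return (stacked * weights.unsqueeze(1)).sum(dim=-1)                              # (B, C)

if __name__ == "__main__":
    torch.manual_seed(0)
    img_logits = torch.randn(4, 10) * 3.0   # a sharp, confident modality
    txt_logits = torch.randn(4, 10) * 0.3   # a noisier, less confident modality
    fused = quality_aware_fusion([img_logits, txt_logits])
    print(fused.shape)  # torch.Size([4, 10])
```

In this sketch the confident modality receives most of the fusion weight on each sample; a static average, by contrast, would let the noisy modality dilute the prediction, which is the failure mode dynamic fusion is meant to avoid.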