Authors
Jindi Lv,Yanan Sun,Qing Ye,Wentao Feng,Jiancheng Lv
Identifier
DOI:10.1016/j.ins.2024.121005
Abstract
Multimodal fusion, a machine learning technique, significantly enhances decision-making by leveraging complementary information extracted from different data modalities. The success of multimodal fusion relies heavily on the design of the fusion scheme, yet this design has traditionally depended on manual expertise and exhaustive trials. To tackle this challenge, researchers have studied DARTS-based Neural Architecture Search (NAS) variants that automate the search for fusion schemes. In this paper, we present theoretical and empirical evidence of catastrophic search bias in DARTS-based multimodal fusion methods. This bias traps the search in a deceptively optimal child network, rendering the entire search process ineffective. To circumvent this phenomenon, we introduce a novel NAS framework for multimodal fusion that features a robust search strategy and a carefully designed multi-scale fusion search space. Notably, the proposed framework captures modality-specific information across multiple scales while automatically balancing intra-modal and inter-modal information. We conduct extensive experiments on three commonly used multimodal classification tasks from different domains and compare the proposed framework against state-of-the-art approaches. The results demonstrate the superior robustness and high efficiency of the proposed framework.
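For readers unfamiliar with the DARTS-style search the abstract refers to, the sketch below illustrates its core idea, the continuous relaxation of a discrete operation choice into a softmax-weighted mixture, applied here to fusing two modality features. The candidate operations and the name `alpha` are illustrative assumptions, not the paper's actual search space.

```python
import numpy as np

def softmax(a):
    """Numerically stable softmax over architecture parameters."""
    e = np.exp(a - a.max())
    return e / e.sum()

# Hypothetical candidate fusion operations on two modality feature vectors.
candidate_ops = {
    "sum":  lambda x, y: x + y,
    "max":  lambda x, y: np.maximum(x, y),
    "mean": lambda x, y: (x + y) / 2.0,
}

def mixed_fusion(x, y, alpha):
    """DARTS-style mixed operation: a weighted sum of all candidate ops,
    where the weights softmax(alpha) are the architecture parameters
    optimized jointly with the network weights during the search."""
    w = softmax(alpha)
    return sum(wi * op(x, y) for wi, op in zip(w, candidate_ops.values()))

x = np.array([1.0, 2.0])      # feature from modality A
y = np.array([3.0, 0.0])      # feature from modality B
alpha = np.zeros(3)           # uniform weights before any search step
out = mixed_fusion(x, y, alpha)
```

After the search, the single operation with the largest weight is kept, yielding a discrete child network; the paper's "catastrophic search bias" refers to this derivation step selecting a misleadingly optimal child.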