Computer science
Pattern
Overfitting
Artificial intelligence
Modality (human-computer interaction)
Factor (programming language)
Machine learning
Representation (politics)
Multimodal learning
Contrast (vision)
Artificial neural network
Social science
Sociology
Politics
Political science
Law
Programming language
Authors
Sun Ya, Sijie Mai, Haifeng Hu
Source
Journal: IEEE Signal Processing Letters
[Institute of Electrical and Electronics Engineers]
Date: 2021-01-01
Volume/Pages: 28: 1650-1654
Citations: 15
Identifier
DOI: 10.1109/lsp.2021.3101421
Abstract
Multimodal networks, with their richer information content, should always outperform their unimodal counterparts. In our experiments, however, we observe that this is not always the case. Prior work on multimodal tasks typically designs a single, uniform optimization algorithm for all modalities, and therefore fuses under-optimized unimodal representations into a sub-optimal multimodal representation; such networks still suffer performance drops caused by the heterogeneity among modalities. In this work, to remove this performance degradation on multimodal tasks, we decouple the learning procedures of the unimodal and multimodal networks by dynamically balancing the learning rates of the various modalities, so that a modality-specific optimization algorithm is obtained for each modality. Specifically, an adaptive tracking factor (ATF) is introduced to adjust the learning rate of each modality in real time. Furthermore, adaptive convergent equalization (ACE) and bilevel directional optimization (BDO) are proposed to equalize and update the ATF, avoiding sub-optimal unimodal representations caused by overfitting or underfitting. Extensive experiments on multimodal sentiment analysis demonstrate that our method achieves superior performance.
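The abstract does not give the ATF/ACE/BDO update rules, so the following is only a minimal, hypothetical PyTorch sketch of the general mechanism it describes: each modality branch gets its own learning rate, which is rescaled on the fly according to how well that branch is currently fitting. All names here (ToyMultimodalNet, update_tracking_factors, the choice of proxy unimodal losses, the clamping range) are invented for illustration and are not the authors' method.

# Hypothetical sketch: per-modality learning-rate balancing via optimizer
# parameter groups. NOT the paper's ATF/ACE/BDO rules.
import torch
import torch.nn as nn

class ToyMultimodalNet(nn.Module):
    """Two unimodal encoders, two auxiliary unimodal heads, and a fusion head."""
    def __init__(self, d_a=16, d_b=16, d_h=8):
        super().__init__()
        self.enc_a = nn.Linear(d_a, d_h)    # modality-A branch (e.g. acoustic)
        self.enc_b = nn.Linear(d_b, d_h)    # modality-B branch (e.g. textual)
        self.head_a = nn.Linear(d_h, 1)     # auxiliary unimodal prediction
        self.head_b = nn.Linear(d_h, 1)     # auxiliary unimodal prediction
        self.head = nn.Linear(2 * d_h, 1)   # fused multimodal prediction

    def forward(self, x_a, x_b):
        h_a = torch.relu(self.enc_a(x_a))
        h_b = torch.relu(self.enc_b(x_b))
        fused = self.head(torch.cat([h_a, h_b], dim=-1))
        return fused, self.head_a(h_a), self.head_b(h_b)

net = ToyMultimodalNet()
base_lr = 1e-3
# One optimizer parameter group per modality branch, plus one for the fusion
# head, so each branch's learning rate can be adjusted independently.
optimizer = torch.optim.SGD([
    {"params": list(net.enc_a.parameters()) + list(net.head_a.parameters()), "lr": base_lr},
    {"params": list(net.enc_b.parameters()) + list(net.head_b.parameters()), "lr": base_lr},
    {"params": net.head.parameters(), "lr": base_lr},
])

def update_tracking_factors(unimodal_losses, lo=0.1, hi=10.0):
    """Illustrative per-modality rescaling: a modality whose unimodal loss is
    above the average (lagging, possibly underfitting) gets a larger learning
    rate, while one below the average (possibly overfitting) gets a smaller one."""
    mean_loss = sum(unimodal_losses) / len(unimodal_losses)
    for group, loss in zip(optimizer.param_groups[:2], unimodal_losses):
        factor = min(hi, max(lo, loss / (mean_loss + 1e-8)))
        group["lr"] = base_lr * factor

criterion = nn.MSELoss()
for step in range(100):                      # random data stands in for a real dataset
    x_a, x_b = torch.randn(32, 16), torch.randn(32, 16)
    y = torch.randn(32, 1)
    fused, pred_a, pred_b = net(x_a, x_b)
    loss_a, loss_b = criterion(pred_a, y), criterion(pred_b, y)
    loss = criterion(fused, y) + loss_a + loss_b
    optimizer.zero_grad()
    loss.backward()
    update_tracking_factors([loss_a.item(), loss_b.item()])
    optimizer.step()

Using separate parameter groups inside a single optimizer keeps the per-modality rates independent without maintaining multiple optimizers, which is one straightforward way to realize the "modality-specific optimization" idea the abstract refers to.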