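Computer science
Information bottleneck method
Mode
Bottleneck
Task (project management)
Representation (politics)
Modality (human-computer interaction)
Artificial intelligence
Maximization
Modal verb
Feature learning
Machine learning
Natural language processing
Mutual information
Economy
Law
Polymer chemistry
Management
Chemistry
Microeconomics
Political science
Embedded system
Sociology
Politics
Social science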
Authors
Tonghui Zhang,Haiying Zhang,Shuke Xiang,Tong Wu
Identifier
DOI:10.1145/3522749.3523069
Abstract
Multimodal Sentiment Analysis (MSA) has recently become an active topic in cross-modal artificial intelligence research. The central challenge of the task is to extract comprehensive information that is dispersed across different modalities. Some existing work designs fusion methods motivated by intra-modality and inter-modality interactions, while other work imposes constraints that remove task-irrelevant information in order to refine single-modality representations. Both approaches are limited by a lack of effective control over information during multimodal representation learning: they may lose task-relevant information or introduce extra noise. To address this issue, we propose a framework named Multimodal Information Bottleneck (MMIB). MMIB imposes mutual information constraints on modality pairs (text-visual, acoustic-visual, text-acoustic) that maximize the mutual information between different modalities and minimize the mutual information within each single modality. This efficiently removes task-irrelevant information from each modality while keeping the task-relevant information, which greatly improves the multimodal representation. Experiments on two widely used public datasets show that the proposed method outperforms existing methods (such as MAG-BERT and Self-MM) in binary classification and achieves comparable performance on the other evaluation metrics.
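The abstract does not give the MMIB loss itself, so the following is only a rough sketch of how a pairwise mutual information objective of this general kind could be written, not the authors' implementation. It assumes two standard stand-ins that the abstract does not name: an InfoNCE lower bound for the mutual information between two modalities, and a Gaussian KL term (as in variational information bottleneck methods) as an upper bound on the information each modality's representation retains about its own input. The function names, the `beta` weight, and the `temperature` value are hypothetical.

```python
import math
from itertools import combinations


def _normalize(v):
    # Scale a vector to unit length (assumes the vector is non-zero).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce_lower_bound(z_a, z_b, temperature=0.1):
    """InfoNCE lower bound on I(z_a; z_b).

    Row i of z_a is assumed to be paired with row i of z_b
    (same utterance seen through two modalities).
    """
    a = [_normalize(v) for v in z_a]
    b = [_normalize(v) for v in z_b]
    n = len(a)
    total = 0.0
    for i in range(n):
        logits = [sum(x * y for x, y in zip(a[i], b[j])) / temperature
                  for j in range(n)]
        m = max(logits)  # subtract the max for numerical stability
        log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
        total += logits[i] - log_norm  # log-softmax of the true pair
    return math.log(n) + total / n


def gaussian_kl(mu, log_var):
    """Batch mean of KL(N(mu, diag(exp(log_var))) || N(0, I))."""
    per_sample = [
        0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(ms, lvs))
        for ms, lvs in zip(mu, log_var)
    ]
    return sum(per_sample) / len(per_sample)


def pairwise_bottleneck_loss(encoded, beta=1e-3):
    """Hypothetical combined loss over all modality pairs.

    encoded maps a modality name to (mu, log_var), each a list of
    per-sample vectors produced by that modality's encoder.
    """
    # Encourage agreement between every pair of modalities.
    agreement = sum(
        info_nce_lower_bound(encoded[a][0], encoded[b][0])
        for a, b in combinations(sorted(encoded), 2)
    )
    # Penalize information kept inside each single modality.
    compression = sum(gaussian_kl(mu, lv) for mu, lv in encoded.values())
    return -agreement + beta * compression
```

With three entries in `encoded` (for example "text", "visual", "acoustic"), `combinations` yields exactly the three modality pairs listed in the abstract. Minimizing the returned value raises the cross-modal bound while shrinking the per-modality KL term; how the actual framework balances or estimates these terms is not stated in the abstract.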