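Modality (human-computer interaction)
Computer science
Modal verb
Exploit
Sentiment analysis
Artificial intelligence
Natural language processing
Mode
Machine learning
Pattern recognition (psychology)
Polymer chemistry
Social science
Chemistry
Computer security
Sociology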
Authors
Jingzhe Li, Chengji Wang, Zhiming Luo, Yuxian Wu, Xingpeng Jiang
Identifier
DOI:10.1109/icassp48485.2024.10445820
Abstract
Recognizing human feelings from images and text is a core challenge of multi-modal data analysis, often applied in personalized advertising. Previous works aim to explore the shared features, which are the matched contents between images and texts. However, cross-modal interactions usually ignore the modality-dependent sentiment information (private features) in each modality, even though the real sentiment is often reflected in one modality. In this paper, we propose a Modality-Dependent Sentiment Exploring framework (MDSE). First, to exploit the private features, we compare the shared features with the original image or text features, identifying previously overlooked unimodal features. Fusing the private and shared features can make the model more robust. Second, to obtain unified sentiment representations, we treat unimodal features and multi-modal fused features equally. We introduce a Modality-Agnostic Contrastive Loss (MACL) that performs contrastive learning between unimodal features and multi-modal fused features. The MACL can fully exploit sentiment information from multi-modal data and reduce the modality gap. Experiments on four public datasets demonstrate the effectiveness of our MDSE compared with existing methods. The full code is available at https://github.com/royal-dargon/MDSE.
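The abstract does not give the formula for MACL, so the following is only a rough illustration of what contrastive learning between unimodal and fused features can look like. It swaps in a generic InfoNCE objective, which is an assumption and not the authors' loss; the function names, the temperature value, and the equal weighting of the image and text terms are all hypothetical. For the actual formulation, see the repository linked above.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def info_nce(anchor, positive, temperature=0.1):
    """Generic InfoNCE loss: row i of `anchor` should match row i of `positive`.

    All other rows in the batch serve as negatives.
    """
    a = l2_normalize(anchor)
    p = l2_normalize(positive)
    logits = a @ p.T / temperature                      # shape (batch, batch)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))


def modality_agnostic_contrastive_sketch(image_feat, text_feat, fused_feat,
                                         temperature=0.1):
    """Hypothetical stand-in for MACL (assumption, not the paper's definition).

    Pulls each sample's unimodal features toward that sample's fused features
    and pushes them away from the fused features of other samples.
    """
    image_term = info_nce(image_feat, fused_feat, temperature)
    text_term = info_nce(text_feat, fused_feat, temperature)
    return 0.5 * (image_term + text_term)
```

In this sketch the loss is small when each sample's image and text features already lie close to its own fused representation, and large when they lie closer to the fused representation of a different sample, which is one way to read "reducing the modality gap".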