Computer science
Sentiment analysis
Modality (human-computer interaction)
Artificial intelligence
Key (lock)
Task (project management)
Feature (linguistics)
Pattern
Feature extraction
Scale (ratio)
Machine learning
Natural language processing
Pattern recognition (psychology)
Data mining
Linguistics
Philosophy
Social science
Physics
Computer security
Management
Quantum mechanics
Sociology
Economics
Authors
Changkai Lin, Hongju Cheng, Qiang Rao, Yang Yang
Source
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
[Institute of Electrical and Electronics Engineers]
Date: 2024-01-01
Volume/Pages: 32: 1416-1429
Identifiers
DOI: 10.1109/taslp.2024.3361374
Abstract
Sentiment analysis plays an indispensable part in human-computer interaction. Multimodal sentiment analysis can overcome the shortcomings of unimodal sentiment analysis by fusing multimodal data. However, how to extract improved feature representations and how to execute effective modality fusion are two crucial problems in multimodal sentiment analysis. Traditional approaches use simple sub-models for feature extraction, ignore features of different scales, and fuse the different modalities equally, which makes it easier to incorporate extraneous information and degrades analysis accuracy. In this paper, we propose a Multimodal Sentiment Analysis model based on Multi-scale feature extraction and Multi-task learning (M$^{3}$SA). First, we propose a multi-scale feature extraction method that models the outputs of different hidden layers with channel attention. Second, we propose a multimodal fusion strategy based on the key modality, which uses an attention mechanism to raise the proportion of the key modality and to mine the relationship between the key modality and the other modalities. Finally, we train the proposed model with a multi-task learning approach, ensuring that it learns better feature representations. Experimental results on two publicly available multimodal sentiment analysis datasets demonstrate that the proposed method is effective and that the proposed model outperforms the baselines.
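The abstract outlines three components: channel attention over features from different hidden layers (multi-scale extraction), attention-based fusion in which the key modality queries the remaining modalities, and multi-task training that combines multimodal and unimodal predictions. The PyTorch sketch below illustrates one plausible reading of these ideas; all module names, tensor shapes, the squeeze-and-excitation style of the channel attention, and the loss weighting are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch only: not the M3SA authors' code. Names, shapes, and
# design details are assumed for demonstration purposes.
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Weights a stack of multi-scale features (one per hidden layer)."""
    def __init__(self, num_scales: int, reduction: int = 2):
        super().__init__()
        hidden = max(num_scales // reduction, 1)
        self.fc = nn.Sequential(
            nn.Linear(num_scales, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_scales),
            nn.Sigmoid(),
        )

    def forward(self, x):                      # x: (batch, num_scales, dim)
        squeeze = x.mean(dim=-1)               # (batch, num_scales)
        weights = self.fc(squeeze).unsqueeze(-1)
        return (x * weights).sum(dim=1)        # fused multi-scale feature


class KeyModalityFusion(nn.Module):
    """Cross-attention where the key modality queries the other modalities."""
    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, key_mod, other_mods):    # key_mod: (batch, dim)
        q = key_mod.unsqueeze(1)               # (batch, 1, dim)
        kv = torch.stack(other_mods, dim=1)    # (batch, n_other, dim)
        fused, _ = self.attn(q, kv, kv)
        return key_mod + fused.squeeze(1)      # residual keeps key modality dominant


def multitask_loss(multimodal_pred, unimodal_preds, label, alpha=0.3):
    """Multi-task objective: shared multimodal prediction plus per-modality
    predictions, so each unimodal encoder also receives direct supervision."""
    mse = nn.functional.mse_loss
    loss = mse(multimodal_pred, label)
    for pred in unimodal_preds:
        loss = loss + alpha * mse(pred, label)
    return loss
```

As a usage sketch: features from several hidden layers of a unimodal encoder would be stacked and passed through ChannelAttention, the resulting text/audio/visual vectors fed to KeyModalityFusion (with, say, text as the key modality), and the fused and unimodal regression heads trained jointly with multitask_loss.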