
Aspect-level multimodal co-attention graph convolutional sentiment analysis model

Authors
Shunjie Wang,Guoyong Cai,Guangrui Lyu,W. F. Tang
Source
Journal: Journal of Image and Graphics, 28(12): 3838-3854. Cited by: 1
Identifier
DOI:10.11834/jig.221015
Abstract

Objective The main task of aspect-level multimodal sentiment analysis is to determine the sentiment polarity of a given target (i.e., aspect or entity) in a sentence by combining relevant modal data sources. This task is considered a fine-grained, target-oriented sentiment analysis task. Traditional sentiment analysis mainly focuses on the content of text data. However, with the increasing amount of audio, image, video, and other media data, merely focusing on the sentiment analysis of single text data is insufficient. Multimodal sentiment analysis surpasses traditional sentiment analysis based on a single text content in understanding human behavior and hence offers more practical significance and application value. Aspect-level multimodal sentiment analysis (AMSA) has attracted increasing attention for revealing the fine-grained emotions of social users. Unlike coarse-grained multimodal sentiment analysis, AMSA not only considers the potential correlation between modalities but also focuses on guiding the aspects toward their respective modalities. However, current AMSA methods do not sufficiently consider the directional effect of aspect words in the context modeling of different modalities and the fine-grained alignment between modalities.
Moreover, the fusion of image and text representations is mostly coarse-grained, leading to insufficient mining of collaborative associations between modalities and limiting the performance of aspect-level multimodal sentiment analysis. To solve these problems, the aspect-level multimodal co-attention graph convolutional sentiment analysis model (AMCGC) is proposed to simultaneously consider the aspect-oriented intra-modal context semantic association and the fine-grained alignment across modalities to improve sentiment analysis performance.

Method AMCGC is an end-to-end aspect-level sentiment analysis method that mainly involves four stages, namely, input embedding, feature extraction, pairwise graph convolution with cross-modality alternating co-attention, and aspect mask setting. First, after obtaining the image and text embedding representations, a contextual sequence of text features containing aspect words and a contextual sequence of visual local features incorporating aspect words are constructed. To explicitly model the directional semantics of aspect words, position encoding is added to the context sequences of text and visual local features based on the aspect words. Afterward, the context sequences of the different modalities are fed into bidirectional long short-term memory networks to obtain the context dependencies of the respective modalities. To obtain the aspect-oriented local semantic correlations within each modality, a self-attention mechanism with orthogonal constraints is designed to generate semantic graphs for each modality. A textual semantic graph representation containing aspect words and a visual semantic graph representation incorporating aspect words are then obtained through a graph convolutional network to accurately capture the local semantic correlation within the modality.
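The pipeline step above (self-attention scores reused as a soft semantic graph, regularized by an orthogonality term, then propagated with graph convolution) can be sketched roughly as follows. This is a minimal illustrative reconstruction, not the authors' code: the function names, dimensions, and the exact form of the penalty are assumptions, and the real model operates on BiLSTM outputs rather than random features.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def semantic_graph(H, Wq, Wk):
    # Self-attention scores used as a soft adjacency matrix over the
    # sequence: A[i, j] = how strongly unit i attends to unit j.
    d = Wq.shape[1]
    scores = (H @ Wq) @ (H @ Wk).T / np.sqrt(d)
    return softmax(scores, axis=-1)

def orthogonality_penalty(A):
    # ||A A^T - I||_F^2: pushes different rows of the attention matrix
    # toward distinct neighbourhoods; added to the training loss.
    n = A.shape[0]
    return float(np.linalg.norm(A @ A.T - np.eye(n)) ** 2)

def gcn_layer(A, H, W):
    # One graph-convolution step over the soft semantic graph, with
    # row-normalised adjacency and a ReLU nonlinearity.
    A_hat = A / A.sum(axis=1, keepdims=True)
    return np.maximum(A_hat @ H @ W, 0.0)

rng = np.random.default_rng(0)
n, d, h = 6, 16, 8                      # sequence length, input dim, hidden dim
H = rng.normal(size=(n, d))             # stand-in for BiLSTM context features
A = semantic_graph(H, rng.normal(size=(d, h)), rng.normal(size=(d, h)))
Z = gcn_layer(A, H, rng.normal(size=(d, h)))
print(A.shape, Z.shape)                 # (6, 6) (6, 8)
```

The same construction would be applied once per modality, yielding the textual and visual semantic graph representations that the later fusion stage consumes.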
Among them, the orthogonal constraint models the local sentiment semantic relationships of data units inside each modality as explicitly as possible and enhances the discrimination of the local features within the modality. A gated local cross-modality interaction mechanism is also designed to embed the text semantic graph representation into the visual semantic graph representation. The graph convolutional network is then used again to learn the local dependencies of the different modalities' graph representations, and the text embedded in the visual semantic graph representation is inversely embedded into the text semantic graph representation so as to achieve fine-grained cross-modality association alignment, thereby reducing the heterogeneity gap between modalities. An aspect mask is designed to select the aspect node features in each modality's semantic graph representation as the final sentiment representation, and a cross-modal loss is introduced to reduce the differences between heterogeneous aspect features.

Result The performance of the proposed model is compared with that of nine recent methods on two public multimodal datasets. The accuracy (ACC) of the proposed model is improved by 1.76% and 1.19% on the Twitter-2015 and Twitter-2017 datasets, respectively, compared with the models with the second-highest performance. Experimental results confirm the advantage of using graph convolutional networks to model the local semantic relation interaction alignment within modalities from a local perspective and highlight the superiority of performing multimodal interaction in a cross-collaborative manner. The model is then subjected to an ablation study from the perspectives of orthogonal constraints, cross-modal loss, cross-coordinated multimodal fusion, and feature redundancy, with experiments conducted on the Twitter-2015 and Twitter-2017 datasets.
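The gated cross-modal interaction, aspect mask, and cross-modal loss described above can be sketched in simplified form. Again, this is an assumed reconstruction rather than the paper's implementation: it shows only one direction of the gated interaction, uses hypothetical names, and assumes the two graphs have already been brought to the same node count and dimension (in the model this alignment comes from the co-attention stage).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_cross_modal(H_txt, H_img, Wg):
    # One direction of the gated interaction: a per-node, per-dimension
    # gate decides how much visual information flows into each text
    # node. The model applies the symmetric operation in the other
    # direction to embed text back into the visual graph.
    gate = sigmoid(np.concatenate([H_txt, H_img], axis=-1) @ Wg)
    return H_txt + gate * H_img

def aspect_feature(H, mask):
    # Mean-pool the node features flagged as aspect nodes by a 0/1 mask.
    return (H * mask[:, None]).sum(axis=0) / mask.sum()

def cross_modal_loss(a_txt, a_img):
    # Mean-squared distance pulling the two heterogeneous aspect
    # features toward each other during training.
    return float(np.mean((a_txt - a_img) ** 2))

rng = np.random.default_rng(1)
n, d = 6, 8                              # assume aligned node counts
H_txt = rng.normal(size=(n, d))
H_img = rng.normal(size=(n, d))
Wg = rng.normal(size=(2 * d, d))
H_fused = gated_cross_modal(H_txt, H_img, Wg)
mask = np.array([0, 1, 1, 0, 0, 0], dtype=float)   # aspect spans nodes 1-2
a_txt = aspect_feature(H_fused, mask)
a_img = aspect_feature(H_img, mask)
print(H_fused.shape, a_txt.shape, cross_modal_loss(a_txt, a_img) >= 0)
```

The additive gating keeps each text node's own features intact while letting in only as much visual signal as the learned gate admits, which is one common way such "gated fusion" mechanisms limit cross-modal noise.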
Experimental results show that all ablation variants are inferior to the full AMCGC model, validating the rationality of each part of the model. Moreover, the orthogonal constraint has the greatest effect, and its absence greatly reduces the effectiveness of the model: removing this constraint reduces the ACC of the proposed model by 1.83% and 3.81% on the Twitter-2015 and Twitter-2017 datasets, respectively. In addition, the AMCGC+BERT model, which is based on bidirectional encoder representations from Transformers (BERT) pre-training, outperforms the AMCGC model based on GloVe. The ACC of the AMCGC+BERT model is increased by 1.93% and 2.19% on the Twitter-2015 and Twitter-2017 datasets, suggesting that a large-scale pre-training-based model has more advantages in obtaining word representations. The hyperparameters of the model are set through extensive experiments, such as determining the number of image regions and the weights of the orthogonal constraint terms. Visualization experiments show that the AMCGC model can capture the local semantic correlation within modalities.

Conclusion The proposed AMCGC model can efficiently capture the local semantic correlation within modalities under the constraint of orthogonal terms. The model can also effectively achieve fine-grained alignment between modalities and improve the accuracy of aspect-level multimodal sentiment analysis.
