Multi-level textual-visual alignment and fusion network for multimodal aspect-based sentiment analysis

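Computer science; Modality; Sentiment analysis; Benchmark (surveying); Artificial intelligence; Visualization; Natural language processing; Process (computing); Operating system; Geography; Sociology; Geodesy; Social science
Authors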
You Li,Han Ding,Yuming Lin,Xinyu Feng,Liang Chang
Source
Journal: Artificial Intelligence Review [Springer Science+Business Media]
Volume (issue): 57 (4) Citations: 7
Identifier
DOI:10.1007/s10462-023-10685-z
Abstract

Multimodal Aspect-Based Sentiment Analysis (MABSA) is an essential task in sentiment analysis that has garnered considerable attention in recent years. Typical approaches in MABSA often utilize cross-modal Transformers to capture interactions between textual and visual modalities. However, bridging the semantic gap between modality spaces and addressing interference from irrelevant visual objects at different scales remain challenging. To tackle these limitations, we present the Multi-level Textual-Visual Alignment and Fusion Network (MTVAF) in this work, which incorporates three auxiliary tasks. Specifically, MTVAF first transforms multi-level image information into image descriptions, facial descriptions, and optical characters. These are then concatenated with the textual input to form a textual+visual input, facilitating comprehensive alignment between visual and textual modalities. Next, both inputs are fed into an integrated text model that incorporates relevant visual representations. Dynamic attention mechanisms are employed to generate visual prompts to control cross-modal fusion. Finally, we align the probability distributions of the textual input space and the textual+visual input space, effectively reducing noise introduced during the alignment process. Experimental results on two MABSA benchmark datasets demonstrate the effectiveness of the proposed MTVAF, showcasing its superior performance compared to state-of-the-art approaches. Our code is available at https://github.com/MKMaS-GUET/MTVAF
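As a rough illustration of two steps the abstract describes, the sketch below builds a textual+visual input by concatenation and scores how far apart the predictions from the two input spaces are. It is not the authors' implementation: the function names, the " [SEP] " separator, the field order, and the choice of a symmetric KL divergence as the alignment measure are all assumptions made for this example; the abstract only states that the inputs are concatenated and that the two probability distributions are aligned.

```python
import math


def build_textual_visual_input(text, image_description, facial_description,
                               ocr_text, sep=" [SEP] "):
    """Concatenate the text with three textual renderings of the image.

    The separator and the field order are assumptions, not from the paper.
    Empty or whitespace-only fields are skipped.
    """
    parts = [text, image_description, facial_description, ocr_text]
    return sep.join(p.strip() for p in parts if p and p.strip())


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same label set."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)


def alignment_loss(p_text, p_text_visual):
    """Symmetric KL between the two predictions (an assumed divergence)."""
    return 0.5 * (kl_divergence(p_text, p_text_visual)
                  + kl_divergence(p_text_visual, p_text))


if __name__ == "__main__":
    # Hypothetical example values, invented for illustration only.
    joined = build_textual_visual_input(
        "Great night at the stadium",
        "a crowd cheering in a stadium",
        "a smiling young man",
        "",  # no optical characters detected
    )
    print(joined)
    print(alignment_loss([0.7, 0.2, 0.1], [0.6, 0.3, 0.1]))
```

In a training setup, a term like `alignment_loss` would typically be added to the main task loss so that predictions from the text-only input and the textual+visual input stay close; how MTVAF weights or computes this term is not specified in the abstract.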