
Multimodal Emotion Recognition Fusion Analysis Adapting BERT With Heterogeneous Feature Unification

Keywords: Computer science; Modality; Artificial intelligence; Sentiment analysis; Speech recognition; Affective computing; Feature (linguistics); Modality (human-computer interaction); Facial expression; Multimodal learning; Human-computer interaction; Social science; Linguistics; Philosophy; Sociology
Authors
SangHyun Lee, David K. Han, Hanseok Ko
Source
Journal: IEEE Access [Institute of Electrical and Electronics Engineers]
Volume 9, pp. 94557-94572. Cited by: 3
Identifier
DOI: 10.1109/access.2021.3092735
Abstract

Human communication includes rich emotional content, so multimodal emotion recognition plays an important role in communication between humans and computers. Because of the complex emotional characteristics of a speaker, emotion recognition remains a challenge, particularly in capturing emotional cues across a variety of modalities, such as speech, facial expressions, and language. Audio and visual cues are particularly vital for a human observer in understanding emotions. However, most previous work on emotion recognition has been based solely on linguistic information, which can overlook various forms of nonverbal information. In this paper, we present a new multimodal emotion recognition approach that improves the BERT model for emotion recognition by combining it with heterogeneous features based on language, audio, and visual modalities. Specifically, we adapt the BERT model to accommodate the heterogeneous features of the audio and visual modalities. We introduce the Self-Multi-Attention Fusion module, the Multi-Attention Fusion module, and the Video Fusion module, which are attention-based multimodal fusion mechanisms using the recently proposed transformer architecture. We explore the optimal ways to combine fine-grained representations of audio and visual features into a common embedding while combining a pre-trained BERT model with the other modalities for fine-tuning. In our experiments, we evaluate our approach on the commonly used CMU-MOSI, CMU-MOSEI, and IEMOCAP datasets for multimodal sentiment analysis. Ablation analysis indicates that the audio and visual components make a significant contribution to the recognition results, suggesting that these modalities contain highly complementary information for sentiment analysis based on video input. Our method achieves state-of-the-art performance on the CMU-MOSI, CMU-MOSEI, and IEMOCAP datasets.
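The core mechanism the abstract describes, letting text representations attend over audio and visual features via scaled dot-product attention before unifying them into a common embedding, can be illustrated with a minimal sketch. This is not the authors' implementation (the paper's modules use full multi-head transformer layers on top of a pre-trained BERT); all array shapes and function names here are hypothetical, chosen only to show the fusion pattern:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(queries, keys_values):
    """Text tokens (queries) attend over another modality's frames.

    queries:     (T_text, d)  text token embeddings
    keys_values: (T_mod, d)   audio or visual frame embeddings
    returns:     (T_text, d)  modality features aligned to each text token
    """
    d_k = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d_k)  # (T_text, T_mod)
    weights = softmax(scores, axis=-1)               # each row sums to 1
    return weights @ keys_values                     # weighted frame average

# Toy embeddings: 4 text tokens, 6 audio frames, 5 video frames, dim 8.
rng = np.random.default_rng(0)
text = rng.standard_normal((4, 8))
audio = rng.standard_normal((6, 8))
video = rng.standard_normal((5, 8))

# Unified embedding: each text token carries its own representation plus
# attention-pooled audio and visual context, ready for a downstream classifier.
fused = np.concatenate(
    [text,
     cross_modal_attention(text, audio),
     cross_modal_attention(text, video)],
    axis=-1)  # shape (4, 24)
```

In the paper's setting, the text embeddings would come from BERT and the fused sequence would be fed back for fine-tuning; here the point is only that attention aligns variable-length audio/visual streams to the text token grid so the modalities can share one embedding space.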
