Computer Science
Fusion
Sensor Fusion
Human–Computer Interaction
Artificial Intelligence
Linguistics
Philosophy
Authors
Tenggan Zhang,Zhaopei Huang,Ruichen Li,Jinming Zhao,Qin Jin
Source
Venue: ACM Multimedia
Date: 2021-10-24
Citations: 1
Identifiers
DOI:10.1145/3475957.3484452
Abstract
Physiological-emotion analysis is a novel aspect of automatic emotion analysis. It can help reveal a subject's emotional state even when the subject consciously suppresses its outward expression. In this paper, we present our solutions for the MuSe-Physio sub-challenge of Multimodal Sentiment Analysis (MuSe) 2021. The aim of this task is to predict the level of psycho-physiological arousal from combined audio-visual signals and the galvanic skin response (also known as Electrodermal Activity signals) of subjects in a highly stressful free-speech scenario. In this scenario, the speaker's emotion can be conveyed through different modalities, including the acoustic, visual, textual, and physiological signal modalities. Because these modalities are complementary, fusing them has a large impact on emotion analysis. In this paper, we highlight two aspects of our solutions: 1) we explore various efficient low-level and high-level features from different modalities for this task, and 2) we propose two effective multi-modal fusion strategies to make full use of the different modalities. Our solutions achieve the best CCC performance of 0.5728 on the challenge test set, significantly outperforming the baseline system's CCC of 0.4908. The experimental results show that our proposed features and fusion strategies generalize well and deliver more robust performance.
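The reported scores use the Concordance Correlation Coefficient (CCC), the standard evaluation metric for continuous emotion prediction. As a minimal sketch (not the authors' implementation), Lin's CCC between a prediction sequence and a gold sequence can be computed as:

```python
import numpy as np

def ccc(y_true, y_pred):
    """Concordance Correlation Coefficient (Lin, 1989).

    Returns 1.0 for perfect agreement; penalizes both low correlation
    and systematic shifts in mean or scale between the two sequences.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mean_t, mean_p = y_true.mean(), y_pred.mean()
    var_t, var_p = y_true.var(), y_pred.var()  # biased (population) variance
    cov = ((y_true - mean_t) * (y_pred - mean_p)).mean()
    return 2.0 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)
```

Unlike plain Pearson correlation, the `(mean_t - mean_p)**2` term in the denominator penalizes predictions that track the gold signal's shape but are offset from it, which is why CCC is preferred for arousal regression tasks like MuSe-Physio.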