Autoencoder
Computer Science
Artificial Intelligence
Emotion Recognition
Pattern Recognition (Psychology)
Speech Recognition
Deep Learning
Authors
Patrick Gao, Tian-Yu Liu, Jiawen Liu, Bao-Liang Lu, Wei-Long Zheng
Identifier
DOI:10.1109/icassp48485.2024.10447194
Abstract
Emotion recognition is a primary yet complex task in emotional intelligence. Because human emotions are complex, multimodal fusion methods can enhance performance by leveraging the complementary properties of different modalities. In this paper, we propose a Multimodal Multi-view Spectral-Spatial-Temporal Masked Autoencoder (Multimodal MV-SSTMA) with self-supervised learning to investigate multimodal emotion recognition based on electroencephalogram (EEG) and eye movement signals. Our experimental process comprises three stages: 1) in the pre-training stage, we employ MV-SSTMA to train feature extractors for EEG and eye movement signals; 2) in the fine-tuning stage, labeled data are fed to the feature extractors, and the extracted features are fused and fine-tuned; 3) in the testing stage, the model recognizes emotions on test data and the accuracies of the different methods are computed. Our experimental results demonstrate that the multimodal fusion model outperforms the unimodal models on both the SEED-IV and SEED-V datasets. In addition, the proposed model can still recognize emotions effectively under various ratios of missing data. These results underscore the effectiveness of multimodal self-supervised learning and data fusion in emotion recognition.
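The three-stage pipeline outlined in the abstract (self-supervised masked-autoencoder pre-training per modality, supervised fusion fine-tuning, then testing) can be sketched in code. The following is a minimal, hypothetical PyTorch illustration only: the module names, feature dimensions (310-dim EEG and 33-dim eye-movement features), masking ratio, and concatenation-based fusion are assumptions made for clarity and do not reproduce the paper's MV-SSTMA architecture.

```python
# Illustrative sketch only: a generic masked-autoencoder pre-training and
# fusion fine-tuning pipeline in PyTorch. Module names, feature dimensions,
# masking ratio, and the concatenation-based fusion are assumptions for
# illustration; they do not reproduce the paper's MV-SSTMA architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MaskedAutoencoder(nn.Module):
    """Per-modality encoder/decoder trained to reconstruct masked features."""

    def __init__(self, in_dim, hidden_dim=128, mask_ratio=0.5):
        super().__init__()
        self.mask_ratio = mask_ratio
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, hidden_dim), nn.GELU(), nn.Linear(hidden_dim, hidden_dim)
        )
        self.decoder = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.GELU(), nn.Linear(hidden_dim, in_dim)
        )

    def forward(self, x):
        # Randomly zero out a fraction of the input features during training;
        # no masking is applied in eval mode.
        ratio = self.mask_ratio if self.training else 0.0
        keep = (torch.rand_like(x) >= ratio).float()
        latent = self.encoder(x * keep)
        recon = self.decoder(latent)
        # Reconstruction loss is computed only on the masked positions.
        masked = 1.0 - keep
        loss = ((recon - x) ** 2 * masked).sum() / masked.sum().clamp(min=1.0)
        return latent, loss


class FusionClassifier(nn.Module):
    """Concatenates the two modality embeddings and predicts an emotion class."""

    def __init__(self, hidden_dim=128, num_classes=5):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim), nn.GELU(), nn.Linear(hidden_dim, num_classes)
        )

    def forward(self, z_eeg, z_eye):
        return self.head(torch.cat([z_eeg, z_eye], dim=-1))


# Toy stand-in data: 310-dim EEG features, 33-dim eye-movement features,
# 5 emotion classes (sizes chosen only for illustration).
eeg, eye = torch.randn(64, 310), torch.randn(64, 33)
labels = torch.randint(0, 5, (64,))

eeg_mae, eye_mae, clf = MaskedAutoencoder(310), MaskedAutoencoder(33), FusionClassifier()

# Stage 1: self-supervised pre-training of each modality's masked autoencoder.
pretrain_opt = torch.optim.Adam(
    list(eeg_mae.parameters()) + list(eye_mae.parameters()), lr=1e-3
)
for _ in range(5):
    _, loss_eeg = eeg_mae(eeg)
    _, loss_eye = eye_mae(eye)
    pretrain_opt.zero_grad()
    (loss_eeg + loss_eye).backward()
    pretrain_opt.step()

# Stage 2: supervised fine-tuning of the fused representation with labeled data.
finetune_opt = torch.optim.Adam(
    list(eeg_mae.parameters()) + list(eye_mae.parameters()) + list(clf.parameters()), lr=1e-3
)
for _ in range(5):
    z_eeg, _ = eeg_mae(eeg)
    z_eye, _ = eye_mae(eye)
    loss = F.cross_entropy(clf(z_eeg, z_eye), labels)
    finetune_opt.zero_grad()
    loss.backward()
    finetune_opt.step()

# Stage 3: testing -- compute classification accuracy (here on the toy batch).
eeg_mae.eval(); eye_mae.eval(); clf.eval()
with torch.no_grad():
    z_eeg, _ = eeg_mae(eeg)
    z_eye, _ = eye_mae(eye)
    acc = (clf(z_eeg, z_eye).argmax(dim=-1) == labels).float().mean()
print(f"toy accuracy: {acc.item():.3f}")
```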