Computer science
Artificial intelligence
Modality (human-computer interaction)
Feature (linguistics)
Pattern recognition (psychology)
Pattern
Speech recognition
Machine learning
Social science
Philosophy
Linguistics
Sociology
Authors
Jiehao Tang, Zhuang Ma, Kaiyu Gan, Jianhua Zhang, Zhong Yin
Identifier
DOI: 10.1016/j.inffus.2023.102129
Abstract
Emotion recognition from a single physiological modality is limited by the lack of complementary affective responses from both the central and peripheral nervous systems. When multiple modalities are integrated, however, direct fusion may ignore the heterogeneity of the feature domains across modalities. Moreover, the distribution of multimodal physiological responses may vary across the different affective scenarios used to elicit the same emotional category, and inter-individual variation may increase as biometric information from the multimodal features is superimposed. To tackle these issues, we present a hierarchical multimodal network for robust heterogeneous physiological representations (RHPRNet). First, a spatial-frequency pattern extractor identifies electroencephalogram (EEG) representations in both the spatial and frequency domains. Next, inter-domain and inter-modality affective encoders are applied to the statistic-complexity EEG features and the multimodal peripheral features, respectively. All learned representations are integrated via a hierarchical fusion module. To model the multi-peak patterns elicited by different affective scenarios, we design a scenario-adapting pretraining stage, and a random contrastive training loss mitigates inter-individual variance. Finally, we evaluate RHPRNet on three publicly available multimodal databases under two validation approaches.
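The abstract's two central ideas, hierarchical fusion of heterogeneous feature domains and a contrastive objective against inter-individual variance, can be illustrated with a short sketch. The PyTorch code below is a hypothetical illustration, not the authors' released implementation: the module names, layer sizes, two-stage fusion layout, and the supervised contrastive formulation are all assumptions made for the example.

import torch
import torch.nn as nn
import torch.nn.functional as F

class HierarchicalFusionSketch(nn.Module):
    """Two-stage fusion: EEG feature domains first, then EEG + peripheral.

    Hypothetical layout inspired by the abstract; dimensions are arbitrary.
    """
    def __init__(self, d_eeg_sf, d_eeg_sc, d_periph, d_model, n_classes):
        super().__init__()
        # Per-domain encoders project heterogeneous features into a shared space.
        self.enc_sf = nn.Sequential(nn.Linear(d_eeg_sf, d_model), nn.ReLU())  # spatial-frequency EEG
        self.enc_sc = nn.Sequential(nn.Linear(d_eeg_sc, d_model), nn.ReLU())  # statistic-complexity EEG
        self.enc_pe = nn.Sequential(nn.Linear(d_periph, d_model), nn.ReLU())  # peripheral signals
        # Stage 1: fuse the two EEG feature domains (inter-domain).
        self.fuse_eeg = nn.Linear(2 * d_model, d_model)
        # Stage 2: fuse the EEG summary with peripheral modalities (inter-modality).
        self.fuse_all = nn.Linear(2 * d_model, d_model)
        self.classifier = nn.Linear(d_model, n_classes)

    def forward(self, x_sf, x_sc, x_pe):
        h_sf, h_sc, h_pe = self.enc_sf(x_sf), self.enc_sc(x_sc), self.enc_pe(x_pe)
        h_eeg = F.relu(self.fuse_eeg(torch.cat([h_sf, h_sc], dim=-1)))
        h = F.relu(self.fuse_all(torch.cat([h_eeg, h_pe], dim=-1)))
        return h, self.classifier(h)

def supervised_contrastive_loss(h, labels, temperature=0.5):
    """Pull same-emotion samples together, push different emotions apart.

    One plausible way to reduce inter-individual variance; the paper's
    "random contrastive training loss" may differ in detail.
    """
    h = F.normalize(h, dim=-1)
    sim = h @ h.T / temperature
    # Exclude self-similarity from both numerator and denominator.
    eye = torch.eye(len(h), dtype=torch.bool, device=h.device)
    sim = sim.masked_fill(eye, float('-inf'))
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # Average log-probability over each anchor's positive pairs.
    pos_counts = pos.sum(1).clamp(min=1)
    return (-(log_prob.masked_fill(~pos, 0.0)).sum(1) / pos_counts).mean()

# Toy usage with random tensors; y stands in for emotion labels.
model = HierarchicalFusionSketch(d_eeg_sf=310, d_eeg_sc=60, d_periph=32,
                                 d_model=128, n_classes=3)
x_sf, x_sc, x_pe = torch.randn(8, 310), torch.randn(8, 60), torch.randn(8, 32)
y = torch.randint(0, 3, (8,))
h, logits = model(x_sf, x_sc, x_pe)
loss = F.cross_entropy(logits, y) + supervised_contrastive_loss(h, y)
loss.backward()

The two-stage layout mirrors the hierarchy described in the abstract: the EEG feature domains are fused first (inter-domain), and the result is then fused with the peripheral modalities (inter-modality), so heterogeneous domains are aligned before cross-modal integration.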