Modality (human–computer interaction)
Speech recognition
Facial expression
Electroencephalography (EEG)
Computer science
Hearing impaired
Artificial intelligence
Psychology
Audiology
Neuroscience
Medicine
Authors
Qingzhou Wu,Mu Zhu,Wenhui Xu,Junchi Wang,Zemin Mao,Qiang Gao,Yu Song
Identifiers
DOI:10.1109/tim.2024.3400341
Abstract
Most current research on emotion recognition relies on single-modal physiological or non-physiological methods, overlooking the complementarity of emotion representations across modalities. Individuals with hearing impairments may exhibit emotional cognitive biases because the auditory pathway for acquiring emotional information is lost. This study therefore introduces the modality-general and modality-specific (MGMS) learning model, which recognizes the emotions of hearing-impaired individuals in four categories (fear, happy, neutral, and sad) by fusing electroencephalogram (EEG) signals and facial expressions. Specifically, differential entropy (DE) features are manually extracted from each EEG channel, grouped by brain region, and the spatial information is then captured by a long short-term memory (LSTM) network. For facial expressions, texture features extracted by a ResNet network are combined with geometric features derived from 68 facial key points. A general-specific discriminator separates the modality-general and modality-specific features of the two modalities, and a Transformer encoder classifies the four resulting features with a cross-entropy loss. Experimental results show that the proposed method achieves an average subject-dependent classification accuracy of 86.01%, surpassing the single-modality accuracies of 65.12% for EEG and 59.86% for facial expressions.
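The sketch below illustrates the fusion idea described in the abstract: each modality's features are projected into a "general" and a "specific" vector, a small discriminator is trained to distinguish general from specific features, and a Transformer encoder classifies the four feature tokens with a cross-entropy loss. All layer sizes, input dimensions (e.g., 310 DE features, a 512-dimensional facial vector), the pooling scheme, and the loss weighting are assumptions for illustration, not the authors' published implementation, and the LSTM/ResNet front-ends are omitted.

```python
# Minimal, hypothetical sketch of an MGMS-style fusion model (assumed details).
import torch
import torch.nn as nn

D = 128          # assumed shared feature dimension
NUM_CLASSES = 4  # fear, happy, neutral, sad


class ModalityEncoder(nn.Module):
    """Maps one modality's input features to general and specific vectors."""

    def __init__(self, in_dim: int):
        super().__init__()
        self.general = nn.Sequential(nn.Linear(in_dim, D), nn.ReLU(), nn.Linear(D, D))
        self.specific = nn.Sequential(nn.Linear(in_dim, D), nn.ReLU(), nn.Linear(D, D))

    def forward(self, x):
        return self.general(x), self.specific(x)


class MGMSSketch(nn.Module):
    def __init__(self, eeg_dim: int = 310, face_dim: int = 512):
        super().__init__()
        self.eeg_enc = ModalityEncoder(eeg_dim)    # e.g. DE features per channel/band
        self.face_enc = ModalityEncoder(face_dim)  # e.g. ResNet texture + landmark geometry
        # Discriminator: is a feature vector "general" (1) or "specific" (0)?
        self.discriminator = nn.Sequential(nn.Linear(D, D), nn.ReLU(), nn.Linear(D, 1))
        enc_layer = nn.TransformerEncoderLayer(d_model=D, nhead=4, batch_first=True)
        self.fusion = nn.TransformerEncoder(enc_layer, num_layers=2)
        self.classifier = nn.Linear(D, NUM_CLASSES)

    def forward(self, eeg_feat, face_feat):
        eg, es = self.eeg_enc(eeg_feat)     # EEG general / specific
        fg, fs = self.face_enc(face_feat)   # face general / specific
        tokens = torch.stack([eg, es, fg, fs], dim=1)   # (batch, 4, D)
        fused = self.fusion(tokens).mean(dim=1)         # pool the four feature tokens
        logits = self.classifier(fused)
        disc_logits = self.discriminator(torch.cat([eg, fg, es, fs], dim=0)).squeeze(-1)
        return logits, disc_logits


if __name__ == "__main__":
    model = MGMSSketch()
    eeg = torch.randn(8, 310)
    face = torch.randn(8, 512)
    labels = torch.randint(0, NUM_CLASSES, (8,))
    logits, disc_logits = model(eeg, face)
    # Emotion classification loss plus a discriminator loss that labels the
    # general features 1 and the specific features 0.
    cls_loss = nn.functional.cross_entropy(logits, labels)
    disc_target = torch.cat([torch.ones(16), torch.zeros(16)])
    disc_loss = nn.functional.binary_cross_entropy_with_logits(disc_logits, disc_target)
    print(cls_loss.item(), disc_loss.item())
```

In practice such a discriminator is usually trained adversarially or with an additional similarity/difference constraint so that general features align across modalities while specific features stay distinct; the exact objectives used in the paper are not given in the abstract.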