Enhanced Trimodal Emotion Recognition Using Multibranch Fusion Attention with Epistemic Neural Networks and Fire Hawk Optimization

融合情绪识别计算机科学人工神经网络认知科学人工智能心理学哲学语言学

作者

Bangar Raju Cherukuri

出处

期刊：Journal of machine and computing 日期：2025-01-03 卷期号：: 058-075

标识

DOI：10.53759/7669/jmc202505005

摘要

Emotions are very crucial for humans as they determine our ways of thinking, our actions, and even how we interrelate with other persons. Recognition of emotions plays a critical role in areas such as interaction between humans and computers, mental disorder detection, and social robotics. Nevertheless, the current emotion recognition systems have issues like noise interference, inadequate feature extraction, and integration of data for the multimodal context that embraces audio, video, and text. To address these issues, this research proposes an "Enhanced Trimodal Emotion Recognition Using Multibranch Fusion Attention with Epistemic Neural Networks and Fire Hawk Optimization." The proposed method begins with modality-specific preprocessing: Natural Language Processing (NLP) for text to address linguistic variations, Relaxed instance Frequency-wise Normalization (RFN) for the audio to minimize distortion of noise’s importance and iterative self-Guided Image Filter (isGIF) for the videos to enhance the image quality and minimize the artifacts. This preprocessing facilitates and optimizes data for feature extracting; an Inception Transformer for capturing the textual contexts; Differentiable Adaptive Short-Time Fourier transform (DA-STFT) to extract the audio's spectral and temporal features; and class attention mechanisms to emphasize important features in the videos. Following that, these features are combined through a Multi-Branch Fusion Attention Network to harmonize all the multifarious modalities into one. The last sanity check occurs through an Epistemic Neural Network (ENN), which tackles issues of uncertainty involved in the last classification, and the Fire Hawk algorithm is used to enhance the emotion recognition capabilities of the framework. Finally the proposed approach attains 99.5% accuracy with low computational time. Thus, the proposed method addresses important shortcomings of the systems developed previously and can be regarded as a contribution to the development of the multimodal emotion recognition field.

求助该文献

Enhanced Trimodal Emotion Recognition Using Multibranch Fusion Attention with Epistemic Neural Networks and Fire Hawk Optimization

今日热心研友