Computer science
Action (physics)
Unit (ring theory)
Speech recognition
Artificial intelligence
Computer vision
Facial expression
Pattern recognition (psychology)
Mathematics
Physics
Quantum mechanics
Mathematics education
Identifier
DOI: 10.1109/icassp49357.2023.10096863
Abstract
Most existing facial action unit (AU) detection models seek to improve detection accuracy by utilizing multiple visual modalities, including 3D geometry, thermal, and depth images. However, current work does not consider the potential of heterogeneous physiological modalities (e.g., heart rate and blood pressure) for AU detection. Moreover, fully exploiting these hidden, emotion-correlated physiological signals is challenging. In this paper, we propose deep networks to extract temporal features from periodic and non-periodic time-series signals, and design an informativeness-based feature fusion module to handle signal noise. We then use spatial-temporal visual representations to infer the physiological embeddings, so that physiological data may be absent during testing. Experiments show that our multimodal framework achieves state-of-the-art performance on two widely used datasets, MMSE and BP4D.
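To make the fusion idea concrete, below is a minimal PyTorch sketch of one plausible informativeness-based fusion module: each modality's temporal feature vector is scored by a small MLP, the scores are softmax-normalized across modalities, and the fused representation is the weighted sum, so noisier signals contribute less. The module structure, scorer design, and all shapes are assumptions for illustration, not the authors' actual implementation.

```python
# Hypothetical sketch: informativeness-based fusion of physiological features.
# Assumes each modality (e.g., heart rate, blood pressure) has already been
# encoded into a fixed-size temporal feature vector by a separate network.
import torch
import torch.nn as nn


class InformativenessFusion(nn.Module):
    """Fuses per-modality features by softmax-normalized informativeness
    scores, down-weighting noisy (less informative) signals."""

    def __init__(self, feat_dim: int):
        super().__init__()
        # A small scorer shared across modalities; outputs one scalar
        # informativeness score per modality feature vector.
        self.scorer = nn.Sequential(
            nn.Linear(feat_dim, feat_dim // 2),
            nn.ReLU(),
            nn.Linear(feat_dim // 2, 1),
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, num_modalities, feat_dim)
        scores = self.scorer(feats).squeeze(-1)         # (batch, num_modalities)
        weights = torch.softmax(scores, dim=-1)         # normalize across modalities
        fused = (weights.unsqueeze(-1) * feats).sum(1)  # weighted sum -> (batch, feat_dim)
        return fused


# Usage with two physiological streams (shapes are illustrative).
fusion = InformativenessFusion(feat_dim=128)
hr_feat = torch.randn(4, 128)   # heart-rate features
bp_feat = torch.randn(4, 128)   # blood-pressure features
fused = fusion(torch.stack([hr_feat, bp_feat], dim=1))
print(fused.shape)  # torch.Size([4, 128])
```

Learned soft weighting of this kind is one common way to suppress noisy modalities without discarding them outright; the paper's module may differ in how informativeness is estimated.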