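Computer science
Facial expression
Artificial intelligence
Discriminative model
Feature learning
Pattern recognition (psychology)
Expression (computer science)
Facial expression recognition
Deep learning
Feature (linguistics)
Facial recognition system
Linguistics
Philosophy
Programming language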
Authors
Weijun Gong,Yurong Qian,Weihang Zhou,Hongyong Leng
Identifier
DOI:10.1016/j.bspc.2023.105316
Abstract
Dynamic facial expression recognition has received increasing attention because image sequences reflect the real process of emotional expression better than a static image. However, factors such as subtle expression variations, pose, occlusion, and illumination make it challenging to obtain discriminative expression features in dynamic facial expression recognition. Traditional CNN-based deep learning networks lack global and temporal contextual understanding of expressions, which tends to degrade the final recognition of dynamic expressions. Therefore, we propose an enhanced spatial–temporal learning network (ESTLNet) for more robust dynamic facial expression recognition, which consists of a spatial fusion learning module (SFLM) and a temporal transformer enhancement module (TTEM). First, the SFLM obtains a more expressive spatial feature representation through a dual-channel feature fusion learning module. Then, the TTEM extracts more effective temporal contextual expression features from these spatial features through an encoder constructed by cascading a self-attention learning network and an effective gated feed-forward network. Finally, the co-enhanced spatial–temporal model is evaluated on four widely used dynamic expression datasets (DFEW, AFEW, CK+, and Oulu-CASIA). Extensive experiments demonstrate that our approach outperforms several existing state-of-the-art methods.
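The encoder structure described for the TTEM (self-attention followed by a gated feed-forward network, applied over per-frame spatial features) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the gating function, the normalization, or the number of attention heads, so the sketch assumes a single head, a sigmoid (GLU-style) gate, and pre-norm residual connections. All function and parameter names (`self_attention`, `gated_ffn`, `encoder_block`, `init_params`) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-5):
    # Normalize each frame's feature vector (no learned scale/shift here).
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def self_attention(x, wq, wk, wv):
    # x: (frames, dim). Every frame attends to every other frame.
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights

def gated_ffn(x, w_gate, w_value, w_out):
    # Assumed gating: a sigmoid gate modulates the hidden activations.
    gate = 1.0 / (1.0 + np.exp(-(x @ w_gate)))
    return (gate * (x @ w_value)) @ w_out

def encoder_block(x, p):
    # Cascade: self-attention, then gated feed-forward, each with a residual.
    attn_out, weights = self_attention(layer_norm(x), p["wq"], p["wk"], p["wv"])
    x = x + attn_out
    x = x + gated_ffn(layer_norm(x), p["w_gate"], p["w_value"], p["w_out"])
    return x, weights

def init_params(dim, hidden, seed=0):
    # Small random weights, for illustration only.
    rng = np.random.default_rng(seed)
    def w(rows, cols):
        return rng.normal(scale=rows ** -0.5, size=(rows, cols))
    return {
        "wq": w(dim, dim), "wk": w(dim, dim), "wv": w(dim, dim),
        "w_gate": w(dim, hidden), "w_value": w(dim, hidden),
        "w_out": w(hidden, dim),
    }

# 16 frames, each with a 32-dimensional spatial feature (stand-in values).
frame_features = np.random.default_rng(1).normal(size=(16, 32))
params = init_params(dim=32, hidden=64)
temporal_features, attn_weights = encoder_block(frame_features, params)
# temporal_features has shape (16, 32); attn_weights has shape (16, 16).
```

In this sketch each row of `attn_weights` sums to 1 and shows how strongly one frame draws on the others, which is the sense in which the encoder adds temporal context to the per-frame spatial features.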