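Computer science
Artificial intelligence
Discriminative model
Transformer
Pattern recognition (psychology)
Robustness (evolution)
Convolutional neural network
Feature extraction
Facial expression
Voltage
Engineering
Biochemistry
Gene
Electrical engineering
Chemistry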
Authors
Fuyan Ma,Bin Sun,Shutao Li
Source
Journal: IEEE Transactions on Affective Computing
[Institute of Electrical and Electronics Engineers]
Date: 2024-01-01
Volume/Issue: (not given), pp. 1-13
Citations: 2
Identifier
DOI: 10.1109/taffc.2023.3285231
Abstract
Facial expression recognition (FER) in the wild is extremely challenging due to occlusions, varying head poses under unconstrained conditions, and incorrect annotations (i.e., label noise). In this paper, we aim to improve the performance of in-the-wild FER with Transformers and online label correction. Unlike purely CNN-based methods, we propose a Transformer-augmented network (TAN) to dynamically capture the relationships within each facial patch and across facial patches. Specifically, the TAN uses a backbone convolutional neural network to translate a number of facial patch images into a set of visual feature sequences. The intra-patch Transformer then captures the most discriminative features within each visual feature sequence, and its position-disentangled attention mechanism is proposed to better incorporate positional information for the feature sequences. Furthermore, we propose the inter-patch Transformer to model the dependencies across these feature sequences. More importantly, we present the online label correction (OLC) framework, which corrects suspicious hard labels and accumulates soft labels based on the model's predictions, strengthening the robustness of our model against label noise. We validate our method on several widely used datasets (RAF-DB, FERPlus, AffectNet), realistic occlusion and pose-variation datasets, and synthetic noisy datasets. Extensive experiments on these benchmarks demonstrate that the proposed method performs favorably against state-of-the-art methods. The source code will be made publicly available.
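The abstract describes OLC only at a high level: soft labels are accumulated from model predictions, and suspicious hard labels are corrected. The sketch below illustrates one plausible reading of that idea in plain Python. It is not the authors' implementation. The exponential-moving-average update, the confidence threshold, and all names (`update_soft_label`, `correct_hard_label`, `online_label_correction`, `momentum`, `threshold`) are assumptions introduced for illustration.

```python
from typing import List, Sequence, Tuple


def one_hot(label: int, num_classes: int) -> List[float]:
    """Return a one-hot distribution for the given class index."""
    return [1.0 if c == label else 0.0 for c in range(num_classes)]


def update_soft_label(soft: Sequence[float], pred: Sequence[float],
                      momentum: float = 0.9) -> List[float]:
    """Accumulate a soft label as a moving average of model predictions.

    Assumption: the abstract does not state the accumulation rule; an
    exponential moving average is used here purely as an example.
    """
    mixed = [momentum * s + (1.0 - momentum) * p for s, p in zip(soft, pred)]
    total = sum(mixed)
    return [m / total for m in mixed]  # renormalise to guard against drift


def correct_hard_label(hard: int, soft: Sequence[float],
                       threshold: float = 0.7) -> int:
    """Relabel a sample when its soft label confidently favours another class.

    Assumption: "suspicious" is modelled as disagreement between the hard
    label and the soft label's top class at or above a fixed threshold.
    """
    best = max(range(len(soft)), key=lambda c: soft[c])
    if best != hard and soft[best] >= threshold:
        return best
    return hard


def online_label_correction(
    hard_labels: Sequence[int],
    predictions_per_epoch: Sequence[Sequence[Sequence[float]]],
    num_classes: int,
    momentum: float = 0.9,
    threshold: float = 0.7,
) -> Tuple[List[int], List[List[float]]]:
    """Run the two steps above once per epoch for every training sample."""
    soft = [one_hot(h, num_classes) for h in hard_labels]
    labels = list(hard_labels)
    for preds in predictions_per_epoch:
        for i, pred in enumerate(preds):
            soft[i] = update_soft_label(soft[i], pred, momentum)
            labels[i] = correct_hard_label(labels[i], soft[i], threshold)
    return labels, soft


if __name__ == "__main__":
    # Sample 0 is annotated as class 0, but the (hypothetical) model keeps
    # predicting class 1; sample 1 is annotated consistently with the model.
    epoch_preds = [[0.05, 0.95], [0.10, 0.90]]
    labels, soft = online_label_correction(
        [0, 1], [epoch_preds] * 3, num_classes=2, momentum=0.5, threshold=0.7
    )
    print(labels)
```

In this toy run, the first sample's label is flipped only once the accumulated soft label crosses the threshold, so a single confident but wrong prediction does not overwrite an annotation. In a real training loop the predictions would come from the network's softmax outputs at each epoch.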