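Computer science
Artificial intelligence
Pyramid (geometry)
Grid
Computer vision
Head (geology)
Pattern recognition (psychology)
Facial expression
Mathematics
Geometry
Geology
Geomorphology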
Authors
Jianyang Zhang,Wei Wang,Xiangyu Li,Yanjiang Han
Identifier
DOI:10.1016/j.cviu.2024.104010
Abstract
Facial Expression Recognition (FER) has garnered considerable interest in the field of computer vision. It is a challenging task that faces key problems such as inter-class similarity, intra-class variability, and environment sensitivity. Traditional Convolutional Neural Networks (CNNs) are limited by their locality and thus have difficulty learning long-range dependencies between elements in the image, which degrades performance. To address these issues, an innovative expression analysis system based on a pyramid multi-head grid and spatial attention network (PMAN) is presented. The PMAN is divided into two stages: the initial feature extraction stage, in which the correlations between various facial zones are learned using Multi-head Grid Attention (MGA), and the deep feature learning stage, in which Multi-head Spatial Attention (MSA) is employed to improve the model's global attention to facial features. In addition, a unique feature pyramid design is implemented at the deep feature learning stage to reduce the network's sensitivity to face image size. Experiments show that the PMAN not only performs significantly better than existing methods on CK+, RAF-DB, FER+, and AffectNet but also achieves 100% accuracy on the CK+ dataset without using pre-trained models.
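
The abstract does not specify how MSA is computed, so the following is only a minimal sketch of generic multi-head scaled dot-product self-attention over the spatial positions of a feature map, which is one common way to let every location attend to every other and so capture long-range dependencies. It is not the authors' implementation: the function name `multi_head_spatial_attention`, the random projection matrices standing in for learned weights, and the 7x7x16 feature map are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_spatial_attention(feat, num_heads, rng):
    """Generic self-attention over spatial positions (hypothetical helper).

    feat: array of shape (H, W, C). Returns (output, attention) with
    output of shape (H, W, C) and attention of shape (heads, H*W, H*W).
    """
    H, W, C = feat.shape
    assert C % num_heads == 0, "channels must divide evenly across heads"
    d = C // num_heads
    n = H * W

    # Flatten the grid so each spatial position becomes one token.
    tokens = feat.reshape(n, C)

    # Assumption: random matrices stand in for learned Q/K/V projections.
    Wq, Wk, Wv = (rng.standard_normal((C, C)) / np.sqrt(C) for _ in range(3))

    def split_heads(x):
        # (n, C) -> (heads, n, d)
        return x.reshape(n, num_heads, d).transpose(1, 0, 2)

    q = split_heads(tokens @ Wq)
    k = split_heads(tokens @ Wk)
    v = split_heads(tokens @ Wv)

    # Each position attends to all positions, not just a local window.
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d), axis=-1)

    # Merge heads back into the channel dimension and restore the grid.
    out = (attn @ v).transpose(1, 0, 2).reshape(n, C)
    return out.reshape(H, W, C), attn


rng = np.random.default_rng(0)
feat = rng.standard_normal((7, 7, 16))  # assumed toy feature map
out, attn = multi_head_spatial_attention(feat, num_heads=4, rng=rng)
print(out.shape, attn.shape)
```

Unlike a convolution, whose receptive field is bounded by its kernel size, each row of the attention matrix here spans all 49 positions, which is the property the abstract appeals to when contrasting attention with CNN locality.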