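Computer science
Pose
Frequency domain
Artificial intelligence
Estimation
Domain (mathematical analysis)
Computer vision
Pattern recognition (psychology)
Mathematics
Management
Economics
Mathematical analysis
Authors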
Shuren Zhou,Xiaodong Duan,Jiarui Zhou
Identifier
DOI:10.1016/j.neucom.2024.128318
Abstract
To address the high computational cost and limited local receptive fields of existing human pose estimation methods, this study proposes a framework called "Frequency Domain and Attention Pose Estimation" (FDAPose). FDAPose combines high-resolution features, the Fast Fourier Transform (FFT), and attention modules to improve the accuracy of human pose estimation while reducing computational cost. We introduce the Depthwise ECA Block (DEBlock) and the Residual ECA Block (REBlock) into the backbone network. These modules reduce the number of model parameters while preserving the high-fidelity feature extraction needed to capture the spatial relationships and details of human body postures. In addition, the Global Context Coordinate Attention (GCCA) module improves the model's use of contextual information, especially under occlusion and against complex backgrounds. Our main contribution is the integration of spatial features extracted at various stages with frequency-domain information obtained via the FFT. This enhances the model's ability to capture long-range dependencies within the image, leading to more accurate pose estimation. The model achieves an average precision of 78.0% and 89.6% on the COCO 2017 and MPII datasets, respectively. Beyond improving accuracy, this study opens new research avenues for the field by integrating frequency-domain and spatial-domain information.
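
The abstract does not specify how FDAPose combines spatial and frequency-domain features, so the following is only a minimal NumPy sketch of the general idea, not the authors' implementation. It assumes a feature map of shape (C, H, W), a hypothetical per-frequency complex weight `freq_weight` standing in for learned parameters, and a simple residual sum as the fusion step. The function names `frequency_branch` and `fuse_spatial_and_frequency` are illustrative. It shows why filtering in the frequency domain gives a global receptive field: every frequency coefficient depends on every pixel, so a single active pixel can influence the whole output map.

```python
import numpy as np


def frequency_branch(feat, freq_weight):
    """Filter a (C, H, W) feature map in the frequency domain.

    freq_weight is a hypothetical stand-in for learned parameters,
    with shape (C, H, W // 2 + 1) to match the rfft2 output.
    """
    _, h, w = feat.shape
    # 2D FFT over the spatial axes; real input gives a half spectrum.
    spectrum = np.fft.rfft2(feat, axes=(-2, -1))
    # Element-wise scaling of each frequency coefficient.
    filtered = spectrum * freq_weight
    # Back to the spatial domain at the original resolution.
    return np.fft.irfft2(filtered, s=(h, w), axes=(-2, -1))


def fuse_spatial_and_frequency(feat, freq_weight):
    """Assumed fusion: residual sum of spatial and frequency branches."""
    return feat + frequency_branch(feat, freq_weight)


rng = np.random.default_rng(0)
c, h, w = 2, 8, 8
freq_weight = (rng.standard_normal((c, h, w // 2 + 1))
               + 1j * rng.standard_normal((c, h, w // 2 + 1)))

# A single active pixel per channel.
impulse = np.zeros((c, h, w))
impulse[:, 0, 0] = 1.0

fused = fuse_spatial_and_frequency(impulse, freq_weight)
# Number of positions in channel 0 influenced by that one pixel.
touched = int(np.count_nonzero(np.abs(fused[0]) > 1e-9))
print(fused.shape, touched)
```

With an all-ones `freq_weight` the frequency branch reduces to the identity, which is a convenient sanity check. A 3x3 convolution applied to the same impulse would touch at most nine positions, whereas the frequency-domain filter can spread it across the entire map in one step.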