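Computer science
Pose
Monocular
Residual
Artificial intelligence
Robustness (evolution)
Benchmark (surveying)
Pairwise comparison
Pattern recognition (psychology)
Computer vision
Machine learning
Algorithm
Gene
Geography
Chemistry
Biochemistry
Geodesy
Authors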
Bingkun Gao,Zhongxin Zhang,Cui-na Wu,Chenlei Wu,Hongbo Bi
Identifier
DOI:10.1007/s10489-022-03516-1
Abstract
Deep end-to-end representation learning for 2D-to-3D monocular human pose estimation is a common yet challenging task in computer vision. Current methods still face the problem that the recognized 3D keypoints are inconsistent with the actual joint positions. Training 2D-to-3D networks on 3D human poses paired with their corresponding 2D projections is an effective strategy for this problem. On this basis, we build a cascaded monocular 3D human pose estimation network that uses a hierarchical supervision network and takes the proposed composite residual module (CRM) and enhanced fusion module (EFM) as its main components. In the cascaded network, CRMs are stacked to form cascaded modules. Compared with the traditional residual module, the proposed CRM expands the information flow channels. In addition, the proposed EFMs are placed alternately with the cascaded modules to address the reduced accuracy and low robustness caused by multi-level cascading. We test the proposed network on the standard Human3.6M benchmark dataset and the MPI-INF-3DHP dataset, comparing it with six algorithms in the fully supervised setting and with five algorithms in the weakly supervised setting. Using the mean per joint position error (MPJPE) in millimeters as the evaluation metric, we obtain the best results among the compared methods.
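For readers unfamiliar with the evaluation metric, MPJPE is commonly computed as the Euclidean distance between each predicted joint and its ground-truth position, averaged over all joints. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `mpjpe` and the two-joint example values are hypothetical, and any alignment step an evaluation protocol may apply beforehand is omitted.

```python
import math


def mpjpe(pred, target):
    """Mean per joint position error between two 3D poses.

    pred, target: sequences of (x, y, z) joint coordinates in millimeters,
    listed in the same joint order.
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("poses must be non-empty and have the same number of joints")
    # Euclidean distance for each joint, averaged over all joints.
    return sum(math.dist(p, t) for p, t in zip(pred, target)) / len(pred)


# Hypothetical two-joint example (illustrative values, not from the paper).
predicted = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
ground_truth = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
error_mm = mpjpe(predicted, ground_truth)  # joint errors are 0 mm and 5 mm
```

In this example the first joint matches exactly and the second is off by 5 mm, so the mean error is 2.5 mm. A lower MPJPE indicates predicted joints that lie closer to the ground truth.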