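Pose
Computer science
Benchmark (surveying)
Artificial intelligence
Machine learning
Consistency (knowledge bases)
Annotation
Viewpoints
Key (lock)
3D pose estimation
Pattern recognition (psychology)
Computer vision
Visual arts
Art
Computer security
Geography
Geodesy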
Authors
Hyunwoo Kim, Gunhee Lee, Woo-Jeoung Nam, Kyung-Min Jin, Tae-Kyung Kang, Geon-Jun Yang, Seong-Whan Lee
Identifier
DOI:10.1016/j.patcog.2023.109908
Abstract
Recent advances in 3D human pose estimation using fully supervised learning have shown impressive results; however, these methods rely heavily on large amounts of annotated 3D data, which are difficult to obtain outside controlled laboratory environments. In this study, we therefore propose a new self-supervised training method for a 3D human pose estimation network using unlabeled multi-view images. The model learns relative depths between joints without any 3D annotation by satisfying multi-view consistency constraints on unlabeled multi-view videos without camera calibration, while simultaneously learning representations of multiple plausible pose hypotheses. For this reason, we call the proposed network the Multi-Hypothesis Canonical Lifting Network (MHCanonNet). By enriching the diversity of extracted features and keeping multiple possibilities open, the network accurately estimates the final 3D pose. The key idea lies in the design of a novel, unbiased reconstruction objective function that combines multiple hypotheses from different viewpoints. The proposed approach achieves state-of-the-art results among self-supervised training methods not only on two popular benchmark datasets, Human3.6M and MPI-INF-3DHP, but also on an in-the-wild dataset, Ski-Pose.
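The abstract does not state the reconstruction objective, so the following is only an illustrative sketch of what a multi-view consistency check over several pose hypotheses could look like, not the MHCanonNet loss. It assumes each view yields K candidate 3D poses with J joints, uses a rigid Kabsch alignment as a stand-in for the unknown camera geometry, and takes the minimum error over hypothesis pairs as a hypothetical way to combine them. The function names `rigid_align` and `multi_view_consistency` are made up for this example.

```python
import numpy as np


def rigid_align(source, target):
    """Align source (J, 3) onto target (J, 3) with a rotation and translation.

    Uses the Kabsch algorithm. This is an assumption made for illustration;
    the abstract does not say how views are related without calibration.
    """
    mu_s = source.mean(axis=0)
    mu_t = target.mean(axis=0)
    s = source - mu_s
    t = target - mu_t
    # SVD of the 3x3 cross-covariance between the centered joint sets
    u, _, vt = np.linalg.svd(s.T @ t)
    # Flip the last axis if needed so the result is a proper rotation
    d = np.sign(np.linalg.det(u @ vt))
    rotation = u @ np.diag([1.0, 1.0, d]) @ vt
    return s @ rotation + mu_t


def multi_view_consistency(hyps_a, hyps_b):
    """Smallest mean per-joint distance over all hypothesis pairs.

    hyps_a, hyps_b: arrays of shape (K, J, 3), one set per view.
    Taking the minimum is a hypothetical combination rule, not the paper's.
    """
    best = np.inf
    for pose_a in hyps_a:
        for pose_b in hyps_b:
            aligned = rigid_align(pose_a, pose_b)
            err = np.linalg.norm(aligned - pose_b, axis=1).mean()
            best = min(best, err)
    return best


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pose = rng.normal(size=(17, 3))  # synthetic 17-joint pose
    theta = 0.7
    rot = np.array([
        [np.cos(theta), -np.sin(theta), 0.0],
        [np.sin(theta), np.cos(theta), 0.0],
        [0.0, 0.0, 1.0],
    ])
    # View A holds the true pose plus a distractor; view B sees the same
    # pose rotated and shifted, again next to a distractor.
    hyps_a = np.stack([pose, rng.normal(size=(17, 3))])
    hyps_b = np.stack([rng.normal(size=(17, 3)), pose @ rot + 2.0])
    print(multi_view_consistency(hyps_a, hyps_b))
```

In this toy setup the score is close to zero whenever at least one hypothesis from each view describes the same pose up to a rigid motion, which is one way of "keeping various possibilities open" while still enforcing agreement across viewpoints.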