Pose
Computer science
Computer vision
Artificial intelligence
Automation
Offset (computer science)
3D pose estimation
Transformation (genetics)
Optics (focusing)
Articulated human pose estimation
Ceiling (cloud)
Engineering
Physics
Optics
Gene
Chemistry
Mechanical engineering
Programming language
Structural engineering
Biochemistry
Authors
Songlin Du, Hao Wang, Zhiwei Yuan, Takeshi Ikenaga
Source
Journal: IEEE Transactions on Automation Science and Engineering
[Institute of Electrical and Electronics Engineers]
Date: 2024-01-01
Pages: 1-14
Cited by: 2
Identifiers
DOI: 10.1109/TASE.2023.3279928
Abstract
Automatically estimating 3D human poses in video and inferring their meanings play an essential role in many human-centered automation systems. Existing research has made remarkable progress by first estimating 2D human joints in video and then reconstructing the 3D human pose from those 2D joints. However, mono-directionally reconstructing 3D pose from 2D joints ignores the interaction between information in 3D space and 2D space, loses rich information from the original video, and therefore limits the ceiling of estimation accuracy. To this end, this paper proposes a bidirectional 2D-3D transformation framework that exchanges 2D and 3D information in both directions and utilizes video information to estimate an offset for refining the 3D human pose. In addition, a bone-length stability loss is employed to exploit human body structure, making the estimated 3D pose more natural and further increasing overall accuracy. In evaluation, the estimation error of the proposed method, measured by the mean per joint position error (MPJPE), is only 46.5 mm, which is much lower than state-of-the-art methods under the same experimental conditions. The improvement in accuracy will enable machines to better understand human poses for building superior human-centered automation systems.

Note to Practitioners: This paper was motivated by the demand of human-centered automation systems that need to accurately understand human poses. Existing approaches mainly focus on inferring 3D human pose from 2D joints mono-directionally. Although they have made remarkable contributions to estimating 3D human pose in such a mono-directional way, we found that they ignore the 2D-3D interaction and do not use the original video when inferring 3D pose from 2D joints. This paper therefore proposes a bidirectional 2D-3D transformation that exchanges 2D and 3D information and utilizes video information to estimate a more accurate 3D human pose for human-centered automation systems.
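The abstract reports accuracy as the mean per joint position error (MPJPE) and mentions a bone-length stability loss. A minimal NumPy sketch of both quantities follows; MPJPE is the standard metric, while the bone list and the exact loss formulation here are assumptions and may differ from the paper's definition:

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean per joint position error: the average Euclidean distance
    between predicted and ground-truth joints, in the input units
    (millimetres for the 46.5 mm figure in the abstract).
    pred, gt: arrays of shape (frames, joints, 3)."""
    return float(np.mean(np.linalg.norm(pred - gt, axis=-1)))

def bone_length_stability_loss(poses, bones):
    """Penalize frame-to-frame variation of each bone's length.
    `bones` is a hypothetical list of (parent, child) joint index
    pairs; the paper's exact formulation may differ.
    poses: array of shape (frames, joints, 3)."""
    lengths = np.stack(
        [np.linalg.norm(poses[:, c] - poses[:, p], axis=-1)
         for p, c in bones],
        axis=1)  # (frames, num_bones)
    # Variance across frames is zero when every bone keeps its length,
    # which is exactly the "stable skeleton" the loss encourages.
    return float(np.mean(np.var(lengths, axis=0)))
```

A rigid pose sequence, in which every bone keeps a constant length, yields a loss of zero, while a prediction shifted from the ground truth by a constant (3, 4, 0) offset has an MPJPE of exactly 5 units.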
This work is a pioneering attempt at interactively using 2D and 3D information for more accurate estimation of human pose. Benefiting from its state-of-the-art accuracy, the proposed approach is expected to make significant contributions to many human-centered automation systems, such as human-machine interaction, biomimetic manipulation, and automatic surveillance systems.
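The pipeline the abstract describes (2D joint estimation, 2D-to-3D lifting, 3D-to-2D feedback, and a video-driven offset refinement) can be sketched with toy linear layers. All dimensions, weights, and function names below are illustrative assumptions, not the paper's actual network architecture:

```python
import numpy as np

# Hypothetical toy dimensions; the paper's real network sizes differ.
FRAMES, JOINTS, FEAT = 8, 17, 16

def lift_2d_to_3d(joints_2d, w_lift):
    """Forward direction: lift 2D joints to a coarse 3D pose."""
    return joints_2d @ w_lift          # (F, J, 2) -> (F, J, 3)

def project_3d_to_2d(joints_3d, w_proj):
    """Backward direction: map the 3D estimate back to 2D so the two
    representations can exchange information (the bidirectional loop)."""
    return joints_3d @ w_proj          # (F, J, 3) -> (F, J, 2)

def refine_with_video_offset(joints_3d, video_feat, w_off):
    """Predict a per-joint offset from video features and add it to the
    coarse 3D pose, as the abstract's offset refinement describes."""
    return joints_3d + video_feat @ w_off  # (F, J, 3)

rng = np.random.default_rng(0)
w_lift = rng.normal(size=(2, 3))
w_proj = rng.normal(size=(3, 2))
w_off = rng.normal(size=(FEAT, 3)) * 0.01

joints_2d = rng.normal(size=(FRAMES, JOINTS, 2))   # stand-in 2D detections
video_feat = rng.normal(size=(FRAMES, JOINTS, FEAT))  # stand-in video features

pose_3d = lift_2d_to_3d(joints_2d, w_lift)
joints_2d_feedback = project_3d_to_2d(pose_3d, w_proj)  # 2D-3D exchange
pose_3d = refine_with_video_offset(pose_3d, video_feat, w_off)
```

In a real system the three linear maps would be learned networks and the feedback projection would feed into the next refinement stage; the sketch only shows how information flows in both directions rather than one.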