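Pose
Metric (unit)
Estimator
Artificial intelligence
Similarity (geometry)
Computer science
Pattern recognition (psychology)
Consistency (knowledge base)
Autocorrelation
Computer vision
Mathematics
Image (mathematics)
Statistics
Operations management
Economics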
Authors
Kyoungoh Lee,Woojae Kim,Sanghoon Lee
Identifier
DOI:10.1109/tpami.2022.3164344
Abstract
Predicting a 3D pose directly from a monocular image is a challenging problem. Most pose estimation methods proposed in recent years have shown ‘quantitatively’ good results (errors below $\sim$50 mm). However, these methods remain ‘perceptually’ flawed because their performance is measured only via a simple distance metric. Although this fact is well understood, the reliance on ‘quantitative’ information has slowed the development of 3D pose estimation methods. To address this issue, we first propose a perceptual Pose SIMilarity (PSIM) metric, assuming that human perception (HP) is highly adapted to extracting structural information from a given signal. Second, we present a perceptually robust 3D pose estimation framework: Temporal Propagating Long Short-Term Memory networks (TP-LSTMs). To this end, we analyze the information-theory-based spatio-temporal posture correlations, including joint interdependency, temporal consistency, and HP. The experimental results clearly show that the proposed PSIM metric achieves a higher correlation with users’ subjective opinions than conventional pose metrics. Furthermore, we demonstrate significant quantitative and perceptual performance improvements of TP-LSTMs over existing state-of-the-art methods.
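The abstract does not name the "simple distance metric"; a common choice in 3D pose estimation is the mean per-joint position error (MPJPE), and the sketch below assumes that is what is meant. The function name `mpjpe` and the three-joint poses are hypothetical and chosen only for illustration. It shows how two predictions with very different structure can receive the same distance score, which is the kind of perceptual blind spot the abstract describes. It does not implement PSIM, whose definition is not given here.

```python
import math
from typing import Sequence

Joint = Sequence[float]  # (x, y, z) coordinates, assumed to be in millimetres


def mpjpe(pred: Sequence[Joint], target: Sequence[Joint]) -> float:
    """Mean Euclidean distance between corresponding joints (same unit as input)."""
    if len(pred) != len(target) or not pred:
        raise ValueError("poses must be non-empty and have the same number of joints")
    return sum(math.dist(p, t) for p, t in zip(pred, target)) / len(pred)


# Hypothetical ground-truth pose: three joints on a straight vertical line.
target = [(0.0, 0.0, 0.0), (0.0, 100.0, 0.0), (0.0, 200.0, 0.0)]

# Prediction A: the whole pose is shifted by 30 mm, so its shape is preserved.
pred_a = [(30.0, 0.0, 0.0), (30.0, 100.0, 0.0), (30.0, 200.0, 0.0)]

# Prediction B: joints are displaced in alternating directions, so the line
# becomes a zigzag and the shape is distorted.
pred_b = [(30.0, 0.0, 0.0), (-30.0, 100.0, 0.0), (30.0, 200.0, 0.0)]

print(mpjpe(pred_a, target))  # 30.0
print(mpjpe(pred_b, target))  # 30.0
```

Both predictions score 30 mm, although only the first keeps the skeleton's structure intact. A per-joint distance treats every joint independently, so it cannot tell these two cases apart.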