Artificial intelligence
Computer vision
Computer science
RGB color model
Image (mathematics)
Computer graphics (images)
Authors
Zixun Jiao,Xihan Wang,Jingcao Li,Rongxin Gao,Miao He,Jiao Liang,Zhaoqiang Xia,Quanli Gao
Identifier
DOI:10.1016/j.patrec.2024.05.019
Abstract
We propose a multi-task progressive Transformer framework to reconstruct hand poses from a single RGB image, addressing challenges such as hand occlusion, hand distraction, and hand shape bias. Our proposed framework comprises three key components: a feature extraction branch, a palm segmentation branch, and a parameter prediction branch. The feature extraction branch first employs the progressive Transformer to extract multi-scale features from the input image. These multi-scale features are then fed into a multi-layer perceptron (MLP) to obtain palm alignment features. An efficient fusion module integrates the palm alignment features with the backbone features to further enhance the parameter prediction features. A dense hand model is generated using a pre-computed, articulated mesh-deformed hand model. We evaluate the performance of our proposed method on the STEREO, FreiHAND, and HO3D datasets separately. The experimental results demonstrate that our approach achieves 3D mean error metrics of 10.92 mm, 12.33 mm, and 9.6 mm on the respective datasets.
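The abstract only outlines the three-branch design, so the following is a minimal, hypothetical sketch (not the authors' released code) of how such a layout could be wired up in PyTorch: a progressive Transformer backbone yielding multi-scale features, an MLP head producing palm alignment features, a simple fusion module merging them with the backbone features, and a parameter head whose outputs would drive a pre-computed deformable hand mesh. All layer sizes, token counts, and module choices are illustrative assumptions.

```python
# Hypothetical sketch of the three-branch framework described in the abstract.
# Dimensions, token counts, and the fusion design are assumptions, not the paper's.
import torch
import torch.nn as nn


class ProgressiveTransformerBackbone(nn.Module):
    """Extracts multi-scale features with a stack of Transformer encoder stages."""

    def __init__(self, dims=(64, 128, 256)):
        super().__init__()
        self.embed = nn.Linear(3, dims[0])  # toy patch/token embedding for RGB tokens
        self.stages = nn.ModuleList()
        self.proj = nn.ModuleList()
        for i, d in enumerate(dims):
            self.stages.append(
                nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True)
            )
            nxt = dims[i + 1] if i + 1 < len(dims) else d
            self.proj.append(nn.Linear(d, nxt))  # project tokens up to the next scale

    def forward(self, tokens):
        feats = []
        x = self.embed(tokens)
        for stage, proj in zip(self.stages, self.proj):
            x = stage(x)
            feats.append(x)       # keep one feature map per scale
            x = proj(x)
        return feats


class PalmAlignmentHead(nn.Module):
    """MLP that turns pooled multi-scale features into palm alignment features."""

    def __init__(self, in_dim, out_dim=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, out_dim)
        )

    def forward(self, feats):
        pooled = torch.cat([f.mean(dim=1) for f in feats], dim=-1)
        return self.mlp(pooled)


class HandPoseModel(nn.Module):
    """Fuses palm alignment and backbone features, then predicts hand-model parameters."""

    def __init__(self, n_params=61):  # e.g. pose + shape + camera parameters (assumed)
        super().__init__()
        self.backbone = ProgressiveTransformerBackbone()
        self.palm_head = PalmAlignmentHead(in_dim=64 + 128 + 256)
        self.fuse = nn.Linear(256 + 128, 256)  # simple stand-in for the fusion module
        self.param_head = nn.Linear(256, n_params)

    def forward(self, tokens):
        feats = self.backbone(tokens)
        palm = self.palm_head(feats)
        fused = torch.relu(
            self.fuse(torch.cat([feats[-1].mean(dim=1), palm], dim=-1))
        )
        return self.param_head(fused)  # parameters for a deformable dense hand mesh


if __name__ == "__main__":
    model = HandPoseModel()
    rgb_tokens = torch.randn(2, 196, 3)  # batch of 2 images flattened to 196 RGB tokens
    print(model(rgb_tokens).shape)       # torch.Size([2, 61])
```

The palm segmentation supervision and the articulated mesh deformation step are omitted here; the sketch only illustrates how multi-scale backbone features and palm alignment features might be fused before parameter prediction.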