Pose
Artificial intelligence
Estimation
Class (philosophy)
Computer science
Feature (linguistics)
Fusion
Pattern recognition (psychology)
Computer vision
Engineering
Linguistics
Philosophy
Systems engineering
Authors
Huafeng Wang, Haodu Zhang, Wanquan Liu, Weifeng Lv, Xianfeng Gu, Kexin Guo
Identifier
DOI:10.1016/j.knosys.2024.111918
Abstract
Most 6D pose estimation studies treat RGB and depth features equally during fusion, which can limit model generalization, especially in multi-class tasks. This limitation stems from prevalent static map generation strategies that overlook discriminative features when downsampling sparse point clouds. In addition, the direct concatenation commonly used for heterogeneous feature fusion tends to produce an averaging effect that weakens the contribution of each feature. To address these challenges, we propose an effective dynamic graph structure feature extraction model that captures richer features from point clouds, and we introduce an adaptive fusion method for heterogeneous features that accounts for their unequal contributions to 6D pose estimation. Validation on the LineMOD and YCB-Video benchmark datasets demonstrates its effectiveness for multi-class 6D pose estimation, surpassing existing fusion methods. Notably, our method attains state-of-the-art (SOTA) results on the YCB-Video dataset. The code for this study can be accessed at https://github.com/ZEROhands/6D_Pose_Estimate.
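To make the contrast in the abstract concrete, the sketch below illustrates the difference between plain concatenation (both modalities contribute with fixed, equal weight) and a simple learned adaptive fusion that re-weights RGB and point-cloud features per point. This is a minimal illustrative sketch, not the authors' implementation; the module name, tensor shapes, and gating design are assumptions for demonstration only.

```python
# Illustrative sketch only -- NOT the paper's actual method.
# AdaptiveFusion, rgb_feat, and pcd_feat are hypothetical names.
import torch
import torch.nn as nn

class AdaptiveFusion(nn.Module):
    """Re-weights heterogeneous RGB and point-cloud features per point."""
    def __init__(self, rgb_dim: int, pcd_dim: int, out_dim: int):
        super().__init__()
        # Gate predicts a weight for each modality from the joint feature.
        self.gate = nn.Sequential(
            nn.Linear(rgb_dim + pcd_dim, out_dim),
            nn.ReLU(inplace=True),
            nn.Linear(out_dim, 2),
            nn.Softmax(dim=-1),
        )
        self.proj_rgb = nn.Linear(rgb_dim, out_dim)
        self.proj_pcd = nn.Linear(pcd_dim, out_dim)

    def forward(self, rgb_feat: torch.Tensor, pcd_feat: torch.Tensor) -> torch.Tensor:
        # rgb_feat: (B, N, rgb_dim), pcd_feat: (B, N, pcd_dim) for N sampled points.
        w = self.gate(torch.cat([rgb_feat, pcd_feat], dim=-1))  # (B, N, 2)
        fused = (w[..., :1] * self.proj_rgb(rgb_feat)
                 + w[..., 1:] * self.proj_pcd(pcd_feat))
        return fused  # (B, N, out_dim); modality contributions are learned, not fixed

def concat_fusion(rgb_feat: torch.Tensor, pcd_feat: torch.Tensor) -> torch.Tensor:
    """Direct concatenation baseline: both modalities weighted equally."""
    return torch.cat([rgb_feat, pcd_feat], dim=-1)
```

Under this assumed formulation, the gate lets the network suppress the weaker modality for a given point instead of averaging both, which is the effect the abstract attributes to direct concatenation.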