End-to-end principle
Computer science
Artificial intelligence
Feature (linguistics)
Pose
Computer vision
Philosophy
Linguistics
Authors
Mingkun Liu,Guangkun Feng,Fulin Liu,Zhenzhong Wei
Identifier
DOI:10.1109/icaace61206.2024.10549418
Abstract
6D pose estimation methods determine the 3D position and 3D orientation of objects from images. End-to-end pose estimation methods aim to regress accurate object poses directly. To further improve performance, we propose a learning-based multiple feature guidance network (MFG-Net) for 6D pose regression. The network simultaneously regresses dense 3D coordinate maps, visible segmentation maps, surface region maps, and 2D directional vector maps. Guided by these dense features, we construct dense 2D-3D correspondences from which the 6D pose parameters of objects are regressed more precisely. To make the network robust to small distortions or noise in the image, we construct a dual-channel regression framework guided by Gaussian blur, which enforces pose consistency and improves generalization. Skip connections are introduced in the encoder-decoder model to retain the detailed information contained in low-level feature maps, thereby improving the accuracy of the dense feature map predictions. Through these improvements in multi-feature guidance, network structure, and data augmentation, we enhance the pose estimation capability of the trained network, as evidenced by significant improvements in test results on the LINEMOD dataset.
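As a rough illustration of two ideas in the abstract, the sketch below gathers dense 2D-3D correspondences from a predicted coordinate map and visibility mask, and computes a simple consistency penalty between the poses predicted by two branches (for example, one fed the original image and one fed a Gaussian-blurred copy). This is not the authors' implementation: the function names, the mask threshold, and the L1 form of the loss are all assumptions made for illustration.

```python
import numpy as np


def dense_correspondences(coord_map, mask, threshold=0.5):
    """Collect 2D-3D pairs from dense network outputs (hypothetical helper).

    coord_map: (H, W, 3) array of object-frame 3D coordinates per pixel.
    mask:      (H, W) array of per-pixel visibility probabilities.
    Returns (N, 2) pixel positions as (x, y) and (N, 3) object points.
    """
    # Keep only pixels the mask marks as visible (threshold is an assumption).
    ys, xs = np.nonzero(mask > threshold)
    points_2d = np.stack([xs, ys], axis=1).astype(np.float64)
    points_3d = coord_map[ys, xs].astype(np.float64)
    return points_2d, points_3d


def pose_consistency_loss(rot_a, trans_a, rot_b, trans_b):
    """Mean absolute difference between two predicted poses (assumed form).

    rot_a, rot_b:     (3, 3) rotation matrices from the two branches.
    trans_a, trans_b: (3,) translation vectors from the two branches.
    """
    rot_term = np.mean(np.abs(rot_a - rot_b))
    trans_term = np.mean(np.abs(trans_a - trans_b))
    return float(rot_term + trans_term)


if __name__ == "__main__":
    # Toy 4x5 prediction with two visible pixels.
    coord_map = np.zeros((4, 5, 3))
    mask = np.zeros((4, 5))
    coord_map[1, 2] = [0.1, 0.2, 0.3]
    mask[1, 2] = 0.9
    pts_2d, pts_3d = dense_correspondences(coord_map, mask)
    print(pts_2d, pts_3d)
    print(pose_consistency_loss(np.eye(3), np.zeros(3), np.eye(3), np.zeros(3)))
```

In the paper the pose is regressed from the correspondences by the network itself; a classical alternative for the same correspondences would be a PnP solver, which is not shown here.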