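Artificial intelligence
Computer science
Computer vision
Pose
Feature (linguistics)
Metric (unit)
3D pose estimation
Residual
RGB color model
Convolutional neural network
Pattern recognition (psychology)
Algorithm
Operations management
Linguistics
Philosophy
Economics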
Authors
Bo You,Mingze Sun,Jiayu Li
Identifier
DOI:10.1117/1.oe.62.5.054102
Abstract
Pose estimation is a fundamental problem in computer vision, and improving its accuracy is important. We propose a high-accuracy pose estimation algorithm based on the You Only Look Once (YOLO) network and a residual network, with a monocular camera as the visual sensor. The algorithm uses ArUco markers as a reference for object localization and takes a red, green, and blue (RGB) image as input. The input image is downsampled by factors of 16 and 32 to extract feature maps; the feature map from 16x downsampling is passed through a pass-through layer and then combined with the feature map from 32x downsampling to expand the feature dimension. The combined feature map is then processed by the convolutional layer for recognition. The EPnP algorithm is used to solve for the camera pose, and the pose of the target object in the RGB image is the output. Evaluated on the LINEMOD dataset with three metrics (the 2D projection metric, the ADD metric, and the 5 cm 5° metric), the proposed algorithm is more accurate than traditional pose estimation algorithms. It also performs well when the target closely resembles background objects.
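The three evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code. It assumes poses are given as a 3x3 rotation matrix R and a translation vector t in meters, a pinhole camera matrix K, and the thresholds conventionally used with LINEMOD (5 px for the 2D projection metric, 10% of the model diameter for ADD, 5 cm and 5° for the third metric), none of which are stated in the abstract. All function names and the sample cube model are hypothetical.

```python
import math

def transform(R, t, p):
    """Apply the rigid transform R * p + t to a 3D point p."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]

def project(K, R, t, p):
    """Project a 3D model point into the image with pinhole intrinsics K."""
    x, y, z = transform(R, t, p)
    return [K[0][0] * x / z + K[0][2], K[1][1] * y / z + K[1][2]]

def add_error(R_gt, t_gt, R_est, t_est, points):
    """ADD: mean 3D distance between model points under the two poses."""
    dists = [math.dist(transform(R_gt, t_gt, p), transform(R_est, t_est, p))
             for p in points]
    return sum(dists) / len(dists)

def projection_error(K, R_gt, t_gt, R_est, t_est, points):
    """2D projection metric: mean pixel distance between projected points."""
    dists = [math.dist(project(K, R_gt, t_gt, p), project(K, R_est, t_est, p))
             for p in points]
    return sum(dists) / len(dists)

def rotation_error_deg(R_gt, R_est):
    """Angle in degrees of the relative rotation R_est * R_gt^T."""
    # trace(R_est * R_gt^T) equals the element-wise product sum of the two.
    trace = sum(R_est[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))

def translation_error(t_gt, t_est):
    """Euclidean distance between the two translations (same unit as t)."""
    return math.dist(t_gt, t_est)

# Hypothetical model: corners of a 10 cm cube centered at the origin.
points = [[x, y, z] for x in (-0.05, 0.05)
                    for y in (-0.05, 0.05)
                    for z in (-0.05, 0.05)]
diameter = 0.1 * math.sqrt(3.0)

# Assumed intrinsics and poses: ground truth 1 m in front of the camera,
# estimate off by 1 cm along x and rotated 3 degrees about the z axis.
K = [[600.0, 0.0, 320.0], [0.0, 600.0, 240.0], [0.0, 0.0, 1.0]]
a = math.radians(3.0)
R_gt = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
R_est = [[math.cos(a), -math.sin(a), 0.0],
         [math.sin(a), math.cos(a), 0.0],
         [0.0, 0.0, 1.0]]
t_gt, t_est = [0.0, 0.0, 1.0], [0.01, 0.0, 1.0]

# Conventional pass/fail thresholds (assumptions, see the note above).
add_ok = add_error(R_gt, t_gt, R_est, t_est, points) < 0.1 * diameter
proj_ok = projection_error(K, R_gt, t_gt, R_est, t_est, points) < 5.0
cm_deg_ok = (translation_error(t_gt, t_est) < 0.05
             and rotation_error_deg(R_gt, R_est) < 5.0)
print(add_ok, proj_ok, cm_deg_ok)
```

The reported accuracy for each metric is then the fraction of test images whose estimated pose passes the corresponding threshold. The three metrics can disagree on the same estimate, because the 2D projection metric is sensitive to depth while the 5 cm 5° metric ignores object size.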