Monocular, Computer Science, Artificial Intelligence, Robustness (Evolution), Computer Vision, Point Cloud, Matching (Statistics), Pixel, Pattern Recognition (Psychology), Mathematics, Biochemistry, Chemistry, Statistics, Gene
Authors
D. Gu, Maoteng Zheng, Peiyu Chen, Xiuguo Liu
Abstract
The learning-based multi-view stereo (MVS) methods have made remarkable progress in recent years. However, these methods exhibit limited robustness when faced with occlusion and weakly or repetitively textured regions in the image. These factors often lead to excessive pixel-matching errors and, consequently, holes in the final point cloud model. To address these challenges, we propose a novel MVS network assisted by monocular prediction for 3D reconstruction. Our approach combines the strengths of both monocular and multi-view branches, leveraging the internal semantic information extracted from a single image through monocular prediction, along with the strict geometric relationships between multiple images. Moreover, we adopt a coarse-to-fine strategy: as the resolution of the input images increases across network iterations, we gradually reduce the number of assumed depth planes and narrow the interval between them. This strategy balances computational resource consumption against the effectiveness of the model. Experiments on the DTU, Tanks and Temples, and BlendedMVS datasets demonstrate that our method achieves outstanding results, particularly in textureless regions.
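To make the coarse-to-fine idea concrete, below is a minimal sketch of a cascade depth-hypothesis schedule of the kind the abstract describes, not the authors' implementation. The function name depth_hypotheses, the per-stage plane counts (48/32/8), and the interval scales (4x/2x/1x the base interval) are illustrative assumptions; in a real network each finer stage would be centred on the depth map regressed at the previous stage, whereas here the range midpoint stands in for that estimate.

```python
# Hypothetical sketch of a coarse-to-fine depth-hypothesis schedule (assumed
# settings, not from the paper): fewer planes and a tighter interval per stage.
import numpy as np

def depth_hypotheses(depth_min, depth_max,
                     stages=({"planes": 48, "interval_scale": 4.0},
                             {"planes": 32, "interval_scale": 2.0},
                             {"planes": 8,  "interval_scale": 1.0})):
    """Yield one array of hypothesised depth planes per stage."""
    # Base interval chosen so the coarsest stage spans the full depth range.
    base_interval = (depth_max - depth_min) / (stages[0]["planes"] * stages[0]["interval_scale"])
    centre = 0.5 * (depth_min + depth_max)  # stand-in for the previous stage's depth estimate
    for stage in stages:
        interval = base_interval * stage["interval_scale"]
        half_span = 0.5 * interval * (stage["planes"] - 1)
        planes = centre + np.linspace(-half_span, half_span, stage["planes"])
        yield np.clip(planes, depth_min, depth_max)

# Example: DTU-like depth range; spacing shrinks as the plane count drops.
for i, planes in enumerate(depth_hypotheses(425.0, 935.0)):
    print(f"stage {i}: {len(planes)} planes, spacing {planes[1] - planes[0]:.2f}")
```

Under these assumed settings the hypothesis span narrows at each stage while the per-plane spacing tightens, which is how such schedules trade memory for depth resolution as the input resolution grows.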